(57) Abstract: The present invention is directed at modifying digital images of faces automatically or semi-automatically. In one
aspect, a method of detecting faces in digital images and matching and replacing features within the digital images is provided.
Techniques for blending, recoloring, shifting and resizing of portions of digital images are disclosed. In other aspects, methods of
virtual "face lifts" and methods of detecting faces within digital images are provided. Advantageously, the detection and localization
of faces and facial features, such as the eyes, nose, lips and hair, can be achieved on an automated or semi-automated basis. User
feedback and adjustment enables fine tuning of modified images. A variety of systems for matching and replacing features within
digital images and detecting faces in digital images are also provided, including implementation as a website, through mobile
phones, handheld computers, or a kiosk. Related computer program products are also disclosed.

METHOD, SYSTEM AND COMPUTER PROGRAM PRODUCT FOR
AUTOMATIC AND SEMI-AUTOMATIC MODIFICATION OF DIGITAL
IMAGES OF FACES

This application claims the benefit of U.S. Provisional
Application No. 60/878,669, filed 01/05/2007, and U.S.
Provisional Application No. 60/797,807, filed
05/05/2006.

Field of the Invention 

The present invention relates to methods and systems for automatically or semi-
automatically manipulating and/or modifying digital images. The present invention more
particularly relates to methods and systems for automatically or semi-automatically 
manipulating and/or modifying digital images of human faces. 

Background of the Invention 

While there has been significant work in face detection (see, for example, Nguyen, D.,
Halupka, D., Aarabi, P., Sheikholeslami, A., "Real-time Face Localization Using Field
Programmable Gate Arrays", IEEE Transactions on Systems, Man, and Cybernetics, Part
B, Vol. 36, No. 4, pp. 902-912, August 2006), there seems to have been little work in the
area of face modification, hair restyling and transforming, and "facelifting" for digital
images.

Specifically, U.S. Patent No. 6,293,284 to Rigg describes a method and apparatus 
utilizing manual user interaction in order to recolor the facial features and to simulate the 
effects of cosmetic products. Unfortunately, this approach does not utilize advanced 
image processing, computer vision or machine learning methodologies and does not 
simulate plastic surgery procedures such as facelifts. As such, a user has to spend
significant time and effort in order to manually enter the parameters for the facial 
recoloring. 

Virtual plastic surgery is the focus of U.S. Patent Nos. 5,854,850 and 5,825,941 to 
Linford et al. and U.S. Patent No. 5,687,259 to Linford. However, the system disclosed
in these references is relatively complicated and is intended to be an in-clinic system used
by professional or experienced operators. Further, the system is not provided on the 
Internet or through mobile and wireless devices, and does not address utilization of 
advanced image processing, computer vision or machine learning methodologies for 
estimating the plastic surgery parameters. As a result, operators are required to manually
adjust the system parameters in order to display the results of plastic surgery in a virtual 
fashion. This system is mostly manual, and does not utilize face localization, feature 
detection, facelifts, or feature/face recoloring on an automatic or semi-automatic basis. 

The method disclosed in U.S. Patent No. 6,502,583 to Utsugi utilizes image processing in 
order to simulate the effects of makeup on a target face. This system, however, does not
utilize automatic or semi-automatic face detection, feature detection, or parameter 
estimation and as a result requires manual user input for estimating the necessary 
parameters. Furthermore, this system was not intended for general virtual face 
modifications, and does not perform virtual plastic surgery nor does it perform hair 
restyling/transformation.

The method and system of U.S. Patent No. 6,453,052 to Kurokawa et al. utilize pre-stored
hair styles to restyle a user image. In other words, it is a unidirectional hair
replacement that does not allow hair styles to be extracted from one image and
placed in another image. As well, this system or method is only a unidirectional
hair replacement system, not being capable of face readjustment, replacement, or
modification. Finally, this system requires hair styles with basic information to be stored,
and does not claim an automatic method for such information extraction.

The system and method of U.S. Patent No. 6,937,755 to Orpaz discloses a manual 
method for visually demonstrating make-up cosmetics and fashion accessories. This 
visualization requires manual user inputs in order to work effectively (i.e. it is neither
automatic nor semi-automatic), and does not allow for hair restyling, advanced face
modifications such as facelifts, or face feature re-coloring and replacement on an
automatic or semi-automatic basis. 

A system and method is disclosed in U.S. Patent No. 5,495,338 to Gouriou et al. which 
utilizes eye information (such as the inner eye colors) in order to estimate the ideal eye
makeup for a given eye. However, this approach is purely a cosmetics suggestion
system; it does not perform any face adjustment, hair restyling, or face recoloring 
automatically, semi-automatically, or even manually. 

U.S. Patent No. 5,659,625 to Marquardt discloses a method involving a geometric model 
to fit the face. These geometric models can be used for face animation as well as for
cosmetics applications. However, this system, again, does not achieve automatic or semi- 
automatic feature modification, facelifting, or hair restyling. 

A method for locating the lips of a face by bandpass filtering is described in U.S. Patent 
No. 5,805,745 to Graf. However, this reference does not disclose a means for detecting 
other features of the face, nor does it describe automatic or semi-automatic face
modifications, facelifts, or hair restyling. Furthermore, the bandpass filtering method is 
unsophisticated, and does not involve feature extraction methods utilizing edge, color 
and/or shape information, or relative feature and face information processing in order to 
accurately locate the facial features. 

The method and apparatus described in U.S. Patent No. 5,933,527 to Ishikawa allows a
user to specify a search range which is then used to search for specific facial features. 
However, the approach taught therein is not capable of automatic facial feature detection, 
and is incapable of automatic or semi-automatic advanced face processing algorithms
such as facelifts. Further, there is no mention of an application operable to switch the
features of one face with another automatically or semi-automatically, and there is no
means for hair restyling or replacement. 

Finally, U.S. Patent No. 7,079,158 to Lambertsen describes a virtual makeover system 
and method. However, the reference does not disclose a means for virtual operations on 
the face or automatic or semi-automatic advanced face modification such as facelifts, and 
suffers from a relatively complicated user interface.

In addition to these prior art references, there are several systems provided on the Internet
that are operable to perform manual face modification, for example, EZface™, Approach
Infinity Media™, and others. However, none of these systems is capable of automatic or
semi-automatic face feature modification, hair restyling, or advanced face processing such
as facelifts. As well, all of these systems employ Macromedia™ Flash
technology, which places a heavier computational burden on the client/user computers
and is not easily capable of being widely employed on mobile phones and handheld
computers. Finally, the user interface complexity of all these systems is problematic as
they are generally difficult to use, complicated to adjust, and far more elaborate to use
than a simple "choose and modify" approach.

In view of the foregoing, what are needed are methods and systems for modifying digital 
face images that overcome the limitations of the prior art described above. In particular, 
what is needed is a method and system employing advanced detection and localization 
techniques for enabling automatic and/or semi-automatic image modification. Further,
what is needed is a method and system where facial modifications are processed on host 
servers instead of the user computers. In addition, what is needed is a method and system 
that is simple, easy to use, and capable of being implemented on a variety of devices. 

Summary of the Invention 

The object of the present invention is a means of automatically modifying digital images
of faces and other features of head shots (such as hair and the neck area, for convenience 
referred to together as a "face"), such means of automatic modification providing in 
whole or in part the modification of the digital image. Modification of a digital image of 
a face in accordance with the present invention that is in part automatic is referred to as
"semi-automatic" modification.

In particular, the present invention provides an automatic or semi-automatic means for 
visualizing the results of a facelift operation, face modification operations, as well as hair 
restyling changes using Artificial Intelligence (AI). 

In one aspect, the present invention provides a method for the modification of face digital 
images comprising: detecting a face in a first digital image and a face in a second digital
image; establishing regions of interest in the face in the first digital image and regions of 
interest in the face in the second digital image; detecting features in the regions of interest 
in the face in the first digital image and features in the regions of interest in the face in 
the second digital image; and matching and replacing one or more of the features in the
face in the first digital image with the one or more features in the face in the second
digital image, thereby defining a modified digital image. The features are, for example, a 
person's eyes, eyebrows, nose, mouth, lips or hair. Processing steps may include 
blending, re-coloring, shifting or resizing the features in the face in generating the 
modified image, achieving a photo-realistic result. User feedback and adjustment enables
fine tuning of the modified images. 

In another aspect, the present invention provides a method of conducting a virtual 
"facelift" in modifying a digital image, the method comprising: detecting a face in the 
digital image; establishing regions of interest in the face in the digital image; detecting 
features in the regions of interest in the face in the digital image; smoothing the face in
the digital image to simulate a facelift; and replacing the features in the face in the digital 
image (since these features are unaffected by the facelift operation), thereby defining a 
modified digital image. User feedback and adjustment enables fine tuning of modified 
images. 

In another aspect, the present invention provides a method for virtual hair restyling of a
digital image, the method comprising: detecting a face in the digital image; establishing
the region of interest of the face; establishing the region of interest of the target hairstyle;
and blending the region of interest of the target hairstyle over the region of interest
of the face.

Advantageously, the present invention is operable to detect faces within digital images
either on an automated basis using detection algorithms or in a "semi-automated" manner
comprising an initial automated estimate of the facial location followed by a user fine-
tuning the estimates.

Feature detection and localization techniques are carried out on one or more target photos 
that are selected by a user. The user also requests the features of interest, e.g., hair, eyes,
nose, lips, and other features, and whether blending to create a "facelift" effect should be 
performed. The relevant features are recolored, blended and combined to result in a 
photo-realistic modified face.

The present invention also enables fine tuning of size and location of the facial features,
either automatically or manually, to increase the perceived beauty of the face. 

The digital images of faces can be derived from video, and face modifications can be 
performed on a frame-by-frame basis to generate new images and/or video. Video 
tracking can be used to improve the accuracy and reliability of the final video result.

The face detection and modifications can be performed on either a two dimensional 
photo, or an estimated three-dimensional template of the face within the photo. The latter 
approach allows for compensation of tilted or rotated faces to result in realistic plastic
surgery visualizations in any setting.

In yet other aspects, the present invention can be embodied in a variety of systems for
matching and replacing features within digital images, providing virtual "facelifts", and
detecting faces in digital images, including implementation as a
website, through mobile phones, handheld computers, or a kiosk. A stand alone or
Internet-connected kiosk operable to perform real-time modification, for example with a
built-in camera, is advantageous because there is no need for user owned hardware.

A simple illustrative user interface is provided allowing the user to select which features 
(e.g., eyes, eyebrows, nose, mouth, lips, hair, etc.) to be selected from a plurality of 
images, consisting of an intuitive 'from-this-image' or 'from-that-image' selection 
criteria for each feature followed by the user selecting advanced single-button options 
(such as a "facelift") and pressing a single "modify" button.

Related computer program products are also disclosed. For example, AJAX 
(Asynchronous Javascript And XML) can be used to implement the present invention as 
a beauty, cosmetics, or plastic surgery application. The advantages of using this 
architecture are that no matter what the client device might be (cell phone, hand held 
computer, variety of computer makes, models, and types, computer kiosks, etc.), the
application can still run successfully through a common Internet browser. 

Accordingly, custom advertising can be delivered based on a user's facial modification 
requests, such that the advertisements, profiles, products, or any other information is 
selectively shown to the user. Also, the invention can be offered as a service to plastic
surgeons, beauty salons, cosmetics manufacturers, modeling agencies, police and other
security agencies, as well as anyone else interested in automated or semi-automated face 
augmentation. 

In addition, the present invention can form the basis of a social network implemented on 
the world wide web, mobile, or other electronic platform which allows for user sharing,
displaying, storing, interacting, and web logging of the user's face modification results or 
the face modification results of other users. 

In another aspect of the invention, a method is provided for modifying digital images 
comprising: detecting a face in a first digital image and optionally detecting a face in a
second digital image, if the location of the face in the first digital image or the second
digital image has not already been established; establishing regions of interest in the face
in the first digital image and optionally establishing regions of interest in the face in the
second digital image; detecting features in the regions of interest in the face in the first
digital image and optionally detecting features in the regions of interest in the face in the
second digital image; and modifying the first digital image by either matching and
replacing one or more of the features in the face in the first digital image with the one or
more features in the regions of interest in the face in the second digital image, thereby
defining a modified digital image; or isolating from modification the regions of interest in
the first digital image, modifying the first digital image other than the regions of interest,
and replacing the regions of interest into the modified first digital image.

In a further aspect of the present invention, a method is provided for modifying a digital 
image comprising: detecting a face in the digital image; establishing regions of interest in 
the face in the digital image; detecting features in the regions of interest in the face in the 
digital image; augmenting the face in the digital image by smoothing selective regions; 
and replacing the features in the face in the digital image, thereby defining a modified
digital image. 

In still another aspect of the present invention a system is provided for modifying digital 
images comprising: a computer linked to a database, the computer including or being 
linked to a utility for enabling one or more users to upload, store, retrieve, email, display
and/or manage digital images; a modification utility linked to the computer, the
modification utility being operable to provide instructions to the computer that enable the
computer to detect a face in a first digital image and optionally detect a face in a second 
digital image, if the location of the faces in the first digital image or the second digital 
image has not already been established as well as establish regions of interest in the face 
in the first digital image and optionally establish regions of interest in the face in the
second digital image; detect features in the regions of interest in the face in the first 
digital image and optionally detect features in the regions of interest in the face in the 
second digital image; and modify the first digital image by either matching and replacing 
one or more of the features in the face in the first digital image with the one or more 
features in the face in the second digital image, thereby defining a modified digital
image; or by isolating from modification the regions of interest in the first digital image, 
modifying the first digital image other than the regions of interest, and replacing the 
regions of interest into the modified first digital image. 

In yet a further aspect of the present invention, a computer program product for enabling 
the modification of digital images is provided comprising: a computer readable medium
bearing software instructions; and the software instructions for enabling the computer to 
perform predetermined operations, the predetermined operations including the steps of: 
detecting a face in a first digital image and optionally detecting a face in a second digital 
image, if the location of the faces in the first digital image or the second digital image has 
not already been established; establishing regions of interest in the face in the first digital
image and optionally establishing regions of interest in the face in the second digital 
image; detecting features in the regions of interest in the face in the first digital image and 
optionally detecting features in the regions of interest in the face in the second digital 
image; and modifying the first digital image by either: matching and replacing one or 
more of the features in the face in the first digital image with the one or more features in
the regions of interest in the face in the second digital image, thereby defining a modified 
digital image; or isolating from modification the regions of interest in the first digital 
image, modifying the first digital image other than the regions of interest, and replacing 
the regions of interest into the modified first digital image. 






WO 2007/128117 PCT/CA2007/000784 


Brief Description of the Drawings 

A detailed description of the preferred embodiments is provided herein below by way of
example only and with reference to the following drawings, in which: 

FIG. 1A illustrates a flow chart of method steps of the present invention;

FIG. 1B is a system diagram illustrating one embodiment of the system of the present
invention; 

FIG. 2 and FIG. 3 illustrate an example web interface for an embodiment of the system 
of the present invention; 

FIG. 4 illustrates a flow chart of method steps of a hair transformation aspect of the 
present invention;

FIG. 5 illustrates a further interface for the system of the present invention, in accordance 
with one particular embodiment of the present invention; 

FIG. 6a, FIG. 6b, FIG. 6c and FIG. 7 illustrate feature detection steps for eyes; 

FIG. 8a, FIG. 8b and FIG. 8c illustrate replacement steps;

FIG. 9a and FIG. 9b illustrate shifting for eye boxes;

FIG. 10a, FIG. 10b and FIG. 10c illustrate a final face after replacement, shifting and 
blending; 

FIG. 11a, FIG. 11b and FIG. 11c illustrate a progression of search box sizes in face
detection; 

FIG. 12 illustrates face symmetry calculation where the average pair-wise square error
between mirror pixels is used as an indication of the face asymmetry (or, the inverse of it 
as an indication of the face symmetry); 

FIG. 13a and FIG. 13b illustrate example templates for face detection purposes;

FIG. 14 illustrates the modifications available for a selective automated facelift;

FIG. 15 illustrates the interface for a selective automated facelift;

FIG. 16 illustrates the process of feature detection;

FIG. 17 illustrates the blending process;

FIG. 18 illustrates the requirement for the comparative feature adjustment;

FIG. 19 illustrates a scenario where a comparative feature adjustment is performed;

FIG. 20 illustrates the three dimensional face reorientation process; and

FIG. 21 illustrates the facelift operation process.

In the figures, embodiments of the invention are illustrated by way of example. It is 
expressly understood that the description and drawings are only for the purpose of 
illustration and as an aid to understanding, and are not intended as a definition of the
limits of the invention.

Detailed Description of the Invention

The term "MODIFACE" as used herein refers to a particular embodiment of the present 
invention that is a system application allowing users to upload, email, send, and display 
digital images of faces, and then apply the automatic or semi-automatic modification 
method and software utility of the present invention. In one aspect thereof, MODIFACE
is a system that can be accessed through the World Wide Web, and is a practical, easy-to- 
use system, providing access to the functions particularized below. 

The object of the present invention is a means of automatically modifying digital images 
of a face and other features of head shots (such as hair and the neck area, for convenience 
referred to together as a "face"), such means of automatic modification providing in
whole or in part the modification of the digital image. Modification of a digital image of 
a face in accordance with the present invention that is in part automatic is referred to as 
"semi-automatic" modification. 

The present invention is a method and system for replacing the facial features of a digital
image with those of another digital image, replacing the hair of one photo with the hair in
another photo, and/or performing a "virtual facelift" or face cleansing/smoothening
operation on a desired image. These actions have many steps in common, and are 
explained below. 

As illustrated in the flowchart of FIG. 1A, the first step in one particular implementation
of the present invention is to upload the one or more images to a web server.

The images are generally uploaded to a web server connected to the Internet, such web 
server incorporating standard resources and functionality generally used for a web server 
that is operable to receive uploaded digital images from a plurality of users, store the digital
images, and enable users to access selected digital images based on hierarchical access
thereto, as well as sort and manage digital images to which they have access. It is also
known how to provide such a web server that is operable to provision mobile devices,
including as particularized below. A representative embodiment of such architecture is
illustrated in FIG. 1B. The web server (100) is linked to a database (102) and to a server
application (104). The server application (104) incorporates the standard features
described above, and linked to the database (102) provides the image storage, retrieval,
sorting and management features mentioned above. In accordance with the present 
invention, the server application (104) also incorporates a modification utility (106), 
which is programmed in a manner that is known to incorporate the functionality 
described below.

One aspect of the invention therefore is a face modification system that incorporates the 
functionality of the modification utility (106). FIG. 1B illustrates one particular
implementation of the face modification system, i.e. implementation as a web service
provisioned by web server (100) to remote computers (personal computers or wireless
devices for example).

It should be understood that the present invention contemplates numerous other 
implementations as well. For example the face modification system of the present 
invention may include a personal computer, and loaded thereon a client application
incorporating the modification utility. It should also be understood that the computer
program of the present invention can be provided as a network application, accessible to
a plurality of computers, as an ASP solution delivered to a plurality of personal
computers, or to a plurality of web servers that in turn provision remote computers (for
example by providing the functions of the present invention as a means of enhancing the
features made available by web servers providing on-line community functionality). The
face modification system, or aspects thereof, can also be integrated with numerous
existing tools, for example, software tools used by cosmetic surgery clinics. It should
also be understood that the system of the present invention can work with mobile phones
and handheld devices such that user images and face modification requests are sent via
email (or mobile multimedia message) to the system of the present invention, and the
result is returned via email (or mobile multimedia message) back to the user.

Also, as explained below, the face modification system of the present invention can be
provided as a kiosk.

In one particular implementation of the present invention, illustrated in FIGS. 2 and 3,
the web server (100) (shown in FIG. 1B) presents a web page that permits users to
upload images or select images already available on the web server (100) and initiate the 
face modification features described below. 

In one aspect of the present invention, the face modification system first detects the
location of the face and facial features of a target digital image, including the eyes, nose, 
and lips. 

Prior to any of these steps, optionally a smart facial image enhancement is performed 
which involves taking a digital image, automatically or semi-automatically (comprising
an initial automatic identification followed by user intervention) identifying the face,
and optionally performing histogram equalization or contrast adjustment on the face
followed by blending the equalized face onto the original digital image. The
blending approach involves a gradual blending such that it is more heavily equalized in
the center of the face and less so around the edges. Also, only partial histogram
equalization is performed in order to not upset the balance of colors on the face
significantly, which can cause distortion. In one particular aspect of the invention, this is 
accomplished by performing a weighted or partial image histogram equalization which 
places more weight on the digital image pixels near the boundaries than digital image 
pixels near the center. 
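
By way of illustration only, the optional enhancement step described above can be sketched as follows. The sketch assumes a grayscale face region held as a list of rows; the function names, the `strength` cap that keeps the equalization partial, and the linear fall-off of the blending weight from the center to the corners are all assumptions made for the example and are not taken from the specification. It follows the description of heavier equalization at the center of the face and lighter equalization toward the edges.

```python
import math

def equalize(values, levels=256):
    """Classic histogram equalization of a flat list of integer intensities."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    n = len(values)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:  # every pixel has the same value: nothing to equalize
        return list(values)
    scale = (levels - 1) / (n - cdf_min)
    return [round((cdf[v] - cdf_min) * scale) for v in values]

def enhance_face(face, strength=0.5):
    """Blend an equalized copy of the face region back onto the original.

    face: 2-D list of grayscale pixel values (0..255).
    strength: hypothetical cap (0..1) that keeps the equalization partial.
    """
    h, w = len(face), len(face[0])
    eq = equalize([p for row in face for p in row])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    max_d = math.hypot(cy, cx) or 1.0
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            d = math.hypot(y - cy, x - cx) / max_d  # 0 at center, 1 at corners
            alpha = strength * (1.0 - d)            # equalized share of the mix
            row.append(round((1 - alpha) * face[y][x] + alpha * eq[y * w + x]))
        out.append(row)
    return out
```

With `strength` below 1 the original colors always retain some weight, which is one simple way to avoid upsetting the color balance of the face.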

In one particular aspect of the present invention, the method and system described utilize
computer vision and machine learning algorithms in order to detect these features. In the
case of the face, this consists of matching a probabilistic face model, or a face template,
to the various locations of the digital image in order to find the most probable location of
the face, as illustrated in the examples provided below. This action is performed at
multiple scales and in a hierarchical fashion in order to detect different face sizes as well
as increase the efficiency of the computations. Pre-computations such as detecting 
specific skin-like colors in an image can be used to speed up the operation even further. 
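
As a non-limiting sketch of the multi-scale template search described above, the following example slides square search boxes of several sizes over a grayscale image and scores each box against a small face template. The scoring rule (negative mean squared error), the `stride_fraction` parameter, and the function names are assumptions chosen for the example; a probabilistic face model could be substituted for the score, and the skin-color pre-computation is omitted for brevity.

```python
def window_score(image, top, left, size, template):
    """Negative mean squared error between the template and the search box
    (top, left, size), with the box sampled at the template's resolution."""
    t = len(template)
    err = 0.0
    for i in range(t):
        for j in range(t):
            y = top + (i * size) // t
            x = left + (j * size) // t
            err += (image[y][x] - template[i][j]) ** 2
    return -err / (t * t)

def detect_face(image, template, sizes, stride_fraction=0.25):
    """Return (score, top, left, size) of the best-scoring square box.

    sizes: candidate box sizes in pixels (the multiple scales).
    stride_fraction: hypothetical step, as a fraction of the box size, so that
    larger boxes are searched on a coarser grid.
    """
    h, w = len(image), len(image[0])
    best = None
    for size in sizes:
        if size > h or size > w:
            continue
        stride = max(1, int(size * stride_fraction))
        for top in range(0, h - size + 1, stride):
            for left in range(0, w - size + 1, stride):
                s = window_score(image, top, left, size, template)
                if best is None or s > best[0]:
                    best = (s, top, left, size)
    return best
```

Stepping larger boxes on a coarser grid is one simple way to keep the cost of the search roughly balanced across scales.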

The core algorithm for face detection can be implemented in software or in custom 
hardware (e.g., field programmable gate arrays or very large scale integrated circuits). 
The methodology for efficient face detection and localization on field programmable gate
arrays has been described in, for example, Nguyen, D., Halupka, D., Aarabi, P.,
Sheikholeslami, A., "Real-time Face Localization Using Field Programmable Gate
Arrays", IEEE Transactions on Systems, Man, and Cybernetics, Part B, Vol. 36, No. 4,
pp. 902-912, August 2006. This particular face recognition technique consists of a block
by block implementation of the face searching system in digital logic running on a field
programmable gate array. 

The detection of the features such as eyes, nose, and lips is performed as follows, in one 
aspect of the present invention. First the located face is divided up into regions of 
interest which may contain the eyes, nose, and lips. These regions may be overlapping. 

In the eye region, the image intensity gradients of the region are extracted and the region
with the largest intensity gradients within an eye template is selected as the eye location
(see FIG. 6a, FIG. 6b, FIG. 6c and FIG. 7). The size of the eye template is proportional
to the size of the detected face. The same highest gradient oval detection is performed on
the right half of the region. The resulting highest-gradient ovals are used as the presumed
eye locations.
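
The gradient-based eye search described above can be illustrated with the following sketch. For simplicity it uses a rectangular template in place of an oval, forward differences as the gradient, and a hypothetical `scale` factor tying the template width to the face width; these choices, and the function names, are assumptions of the example rather than features of the described method.

```python
def gradient_magnitude(region):
    """Per-pixel sum of absolute horizontal and vertical forward differences."""
    h, w = len(region), len(region[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h - 1):
        for x in range(w - 1):
            gx = region[y][x + 1] - region[y][x]
            gy = region[y + 1][x] - region[y][x]
            grad[y][x] = abs(gx) + abs(gy)
    return grad

def strongest_box(grad, box_h, box_w, x_start, x_end):
    """Top-left corner of the box with the largest gradient sum, with the box
    confined to columns x_start (inclusive) to x_end (exclusive)."""
    best, best_pos = -1, None
    for top in range(len(grad) - box_h + 1):
        for left in range(x_start, x_end - box_w + 1):
            total = sum(sum(grad[y][left:left + box_w])
                        for y in range(top, top + box_h))
            if total > best:
                best, best_pos = total, (top, left)
    return best_pos

def locate_eyes(eye_region, face_width, scale=0.2):
    """Search the left and right halves of the eye region separately.

    scale: hypothetical ratio of eye-template width to face width.
    """
    grad = gradient_magnitude(eye_region)
    w = len(eye_region[0])
    box_w = max(1, round(face_width * scale))
    box_h = max(1, box_w // 2)
    left_eye = strongest_box(grad, box_h, box_w, 0, w // 2)
    right_eye = strongest_box(grad, box_h, box_w, w // 2, w)
    return left_eye, right_eye
```

The same `strongest_box` search could be reused with a lip-sized template for locating the lips.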

The lips are detected next by a similar procedure, where the region with the largest edge 
gradients within a lip template is selected as the lip. 

The location of the nose is determined based on the positions of the eyes and the lips. 
The nose will have a bottom that just slightly overlaps with the lips, a top that touches the 
edge of the eyes, and a width that is in proportion to the face.
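
One possible reading of this geometric rule is sketched below. The `Box` type, the `width_ratio` and `overlap_ratio` parameters, and the choice to center the nose between the outer edges of the two eye boxes are hypothetical and serve only to make the rule concrete.

```python
from typing import NamedTuple

class Box(NamedTuple):
    top: int
    left: int
    bottom: int
    right: int

def estimate_nose(left_eye, right_eye, lips, face,
                  width_ratio=0.3, overlap_ratio=0.1):
    """Derive a nose box from the eye, lip and face boxes.

    width_ratio: hypothetical nose width as a fraction of the face width.
    overlap_ratio: hypothetical fraction of the lip height that the nose
    bottom extends into the lip box.
    """
    # The top of the nose touches the lower edge of the eyes.
    top = max(left_eye.bottom, right_eye.bottom)
    # The bottom of the nose slightly overlaps the lips.
    overlap = max(1, round((lips.bottom - lips.top) * overlap_ratio))
    bottom = lips.top + overlap
    # The width is in proportion to the face, centered between the eyes.
    center_x = (left_eye.left + right_eye.right) // 2
    half_w = round((face.right - face.left) * width_ratio) // 2
    return Box(top, center_x - half_w, bottom, center_x + half_w)
```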

Once these features have been located, they can be combined with the detected features 
of another photo (detected using the same procedure) by blending either a face or facial 
feature into another digital image. Prior to the blending, the feature locations are
preferably adjusted, both to fine tune the previously detected locations and to 'match' the
locations of the features of the two faces. This matching is done by comparative
adjustments to the detected eye, lip and nose locations and slight adjustments to align the
gradient intensities of the eyes and lips.



Once the feature locations have been finalized, the desired feature is color adjusted and
blended on top of the original feature. For example, for switching the eyes (or nose or
lips) of two photos, once the eyes (or nose or lips) in both images have been localized,
the eye (nose or lip) from the first image (see FIG. 8a) is smoothly blended into the eye
(or nose or lip) box of the second image (see FIG. 8b) resulting in a new combined image
(see FIG. 8c).

As used herein, the term "box" should be understood to include any shape suitable to
focus in on a region of interest, whether the area of interest relates to the eyes, lips, nose 
or otherwise. For example, an eye box can be round, square, rectangular, oval, 
pentagonal, etc. 

Prior to this blending, the features can be recolored (by performing a histogram
transformation on each of the color histograms in order to equalize the red, green, and
blue average pixel values for each image) to match the histograms of the previous
features (the features which are being replaced). This color transformation is preferably
performed when changing the eyes, nose, and lips. In order to improve the level of
realism of the final result, the re-coloring is applied mainly using the color values of the
outer areas of the features and less so in the center of the feature. For example, in the
case of the eye, the inner eye color of the desired eye makes a smaller contribution to the
color histograms than the color around the eye. This is further illustrated by FIG. 9a and
FIG. 9b with the capture of the color transformation for changing eyes.

Different blending masks can be applied to the recolored areas and original features and
the masked layers are then added to result in the final features. The mask shapes for each
feature are custom designed for the general shape of the feature. Depending on the mask,
the blending consists of gradient filling whose center consists entirely of the first eye (or
nose or lip) and whose borders (as defined by the feature mask) consist entirely of the
second eye (or nose or lip) box. In between the center and the border, the ratio of the first
eye (or nose or lip) and second eye (or nose or lip) gradually changes in order to result in a
smooth contour and smooth blending. Similarly, this blending can be performed for
other facial features (or even for the entire face), as requested by the user. This is further
illustrated in FIG. 10a, FIG. 10b and FIG. 10c with the final appearance of a face after
the replacement, shifting and the blending steps have been performed.

The above achieves the results of exchange of features between selected digital images of
faces, in accordance with the present invention.

As stated previously, another aspect of the present invention is the performance of a 
virtual facelift or face cleansing/smoothening operation. This is done by first detecting
the locations of the eyes, nose, and lips as outlined above, then smoothing/lifting the face by
blurring it (or, as a more complicated operation, retouching the face) in such a manner
that the blurring (or retouching) is most severe in the center of the face and gradually
decreases in intensity further away from the face center, and finally by re-blending the
initial (non-smoothed) face features (eyes, nose, and lips) on top of the smoothed face. As
a more advanced operation, instead of blurring the center of the face the blurring can be 
applied selectively to regions such as below the eyes, between the eye brows, and around 
the lips to simulate customized facelift or facelift product applications. 

Outlined below are the specific details of a subset of the procedures claimed in this 
patent: 

Feature Detection 

The main facial features (lips and eyes) are detected by the following set of steps: 

1. The gradient magnitude image of the face is obtained (this is done by subtracting
each pixel from the pixel just above it, or by taking the square root of the
square of the vertical pixel difference plus the square of the horizontal pixel
difference).

2. We focus on the specific locations of the face where we expect to find specific 
features. 

3. A search is conducted to find a small sub-region (the sub-region size is chosen in 
proportion to the face size) within each focused region such that the total gradient 
magnitude in each sub region is maximized. Please note that usually this 
summation is done on a weighted basis using an appropriate feature mask. 

4. Once the lip and eye locations have been found, the nose location is estimated as 
follows 

a. Nose_height=0.4*face_height 






b. Nose_width=0.4*face_width 

c. Nose_left=(eyes_horizontal_mid_point+lip_horizontal_mid_point)/2-Nose_width/2

d. Nose_top=(lip_top+lip_height*0.3-Nose_height)

FIG. 16 illustrates this process of feature detection.
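
The feature detection steps above can be sketched in code. The following is a minimal illustration only, assuming a grayscale face image held in a NumPy array; the function names, the exhaustive window search, and the default all-ones mask are assumptions made for the example and are not taken from the original disclosure.

```python
import numpy as np

def gradient_magnitude(gray):
    """Square root of the squared vertical plus squared horizontal pixel differences."""
    g = gray.astype(float)
    dy = np.zeros_like(g)
    dx = np.zeros_like(g)
    dy[1:, :] = g[1:, :] - g[:-1, :]   # each pixel minus the pixel just above it
    dx[:, 1:] = g[:, 1:] - g[:, :-1]   # each pixel minus the pixel to its left
    return np.sqrt(dx ** 2 + dy ** 2)

def find_feature(grad, region, box_h, box_w, mask=None):
    """Return (top, left) of the box inside `region` with the largest weighted
    gradient sum. `region` is (top, left, height, width); `mask` is an optional
    feature mask of shape (box_h, box_w) (assumed all ones when omitted)."""
    top, left, h, w = region
    if mask is None:
        mask = np.ones((box_h, box_w))
    best_score, best_pos = -1.0, (top, left)
    for r in range(top, top + h - box_h + 1):
        for c in range(left, left + w - box_w + 1):
            score = float(np.sum(grad[r:r + box_h, c:c + box_w] * mask))
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos

def estimate_nose(face_h, face_w, eyes_mid_x, lip_mid_x, lip_top, lip_h):
    """Nose box from the eye and lip locations, using the four formulas above."""
    nose_h = 0.4 * face_h
    nose_w = 0.4 * face_w
    nose_left = (eyes_mid_x + lip_mid_x) / 2 - nose_w / 2
    nose_top = lip_top + lip_h * 0.3 - nose_h
    return nose_top, nose_left, nose_h, nose_w
```

In this sketch the sub-region size (box_h, box_w) would be chosen in proportion to the face size, as step 3 describes.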

Blending 

The blending of a feature is accomplished as follows: 

1. The desired feature is recolored to match the color of the original feature.

2. The result of step 1 is multiplied by a feature mask. 

3. The original feature is multiplied by the inverse (i.e. one minus each of the mask 
values, which range from 0 to 1) of the feature mask. 

4. The resulting images of steps 2 and 3 are added pixel by pixel to make the final 
blended feature image. 

FIG. 17 illustrates the blending process. 
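
Steps 2 to 4 of the blending procedure can be sketched as follows. This is an illustration under stated assumptions: the recolored desired feature and the original feature are NumPy arrays of the same size, and radial_mask is a hypothetical mask (1 at the center, falling to 0 at the border) standing in for the custom-designed feature masks mentioned earlier.

```python
import numpy as np

def radial_mask(h, w):
    """Hypothetical feature mask: 1 at the center, decreasing linearly to 0 at the border."""
    ys = np.linspace(-1.0, 1.0, h)[:, None]
    xs = np.linspace(-1.0, 1.0, w)[None, :]
    return 1.0 - np.maximum(np.abs(ys), np.abs(xs))

def blend(desired, original, mask):
    """Multiply the desired feature by the mask, the original feature by
    (1 - mask), and add the two results pixel by pixel."""
    m = mask[..., None] if desired.ndim == 3 else mask  # broadcast over color channels
    return desired * m + original * (1.0 - m)
```

With such a mask the center of the blended box consists entirely of the desired feature and the border entirely of the original feature, with a gradual transition in between.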

Recoloring 

Recoloring of the desired feature to match the color of the original feature (especially at 
the boundaries) is accomplished as follows: 

1. The weighted average (i.e. weighted mean) of each of the red, green, and blue
channels of the original feature is calculated as follows:

a. A feature color mask is multiplied pixel-by-pixel with each of the red,
green, and blue channels of the original feature image; and

b. The resulting pixel values are summed across each of the red, green, and
blue channels, and divided by the total sum of the pixels in the feature
color mask - we denote these averages as Or, Og, Ob.

2. The weighted average (i.e. weighted mean) of each of the red, green, and blue
channels of the desired feature is calculated as follows:

a. A feature color mask is multiplied pixel-by-pixel with each of the red,
green, and blue channels of the desired feature image; and

b. The resulting pixel values are summed across each of the red, green, and
blue channels, and divided by the total sum of the pixels in the feature
color mask - we denote these averages as Dr, Dg, Db.

3. The value of each of the pixels in the desired image is modified by adding the
value Or-Dr to each of the red channel pixels, Og-Dg to each of the green channel
pixels, and Ob-Db to each of the blue channel pixels, resulting in the recolored
desired image.
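
The recoloring steps above amount to a per-channel shift by the difference of two weighted means. A minimal sketch follows, assuming both feature images are RGB NumPy arrays of the same box size and share one feature color mask; the function names and the final clipping to the 0-255 range are assumptions added for the example.

```python
import numpy as np

def weighted_channel_means(feature, color_mask):
    """Mask-weighted mean of each of the red, green, and blue channels."""
    weights = color_mask[..., None]
    return (feature * weights).sum(axis=(0, 1)) / color_mask.sum()

def recolor(desired, original, color_mask):
    """Shift each channel of the desired feature by (O - D)."""
    o_mean = weighted_channel_means(original.astype(float), color_mask)  # Or, Og, Ob
    d_mean = weighted_channel_means(desired.astype(float), color_mask)   # Dr, Dg, Db
    shifted = desired.astype(float) + (o_mean - d_mean)
    return np.clip(shifted, 0, 255)  # clipping is an added assumption for 8-bit images
```

A color mask with larger weights near the border of the box and smaller weights in the center would give the outer areas of the feature the larger contribution described earlier.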


Comparative Feature Adjustment 

On certain occasions it is possible to have two feature boxes (one on the original face,
one on the desired face) where both boxes are located correctly but, relative to each other,
are not at the same locations on the face. In this scenario, the resulting modified face will
have features that will look incorrect. This comparative feature adjustment situation is
best illustrated in FIG. 18.

Because a modified face may otherwise possess features that appear incorrect, whenever
features are being replaced on the original face, a comparative adjustment is performed to
make sure that all features are at the same relative locations. This is accomplished by the
following steps:

1. Obtaining the gradient magnitude for both the desired features and the original
features.

2. Finding an alignment between the two located features such that their gradient
magnitudes have the highest degree of overlap.

3. Adjusting the feature location of the desired face according to step 2.

This process is further illustrated in FIG. 19.
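
One way to realize step 2 is an exhaustive search over small shifts. The sketch below is an assumption-laden illustration: it takes two gradient magnitude images of equal size, and it measures "overlap" as the sum of pixel-wise products over the overlapping window, which is a choice made for this example rather than a measure stated in the text. The function name and the max_shift parameter are likewise hypothetical.

```python
import numpy as np

def best_shift(grad_original, grad_desired, max_shift=3):
    """Return the (dy, dx) shift of the desired feature that maximizes the
    overlap of the two gradient magnitude images."""
    h, w = grad_original.shape
    best_score, best = -np.inf, (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Window shared by both images when the desired feature is moved by (dy, dx).
            o = grad_original[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
            d = grad_desired[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            score = float(np.sum(o * d))
            if score > best_score:
                best_score, best = score, (dy, dx)
    return best
```

The returned shift would then be applied to the feature box of the desired face, as in step 3.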

Location Adjustment Based on Facial Beauty Scores 

The localized features can be optionally processed by a "beauty" filter which utilizes
mathematical measurements of the facial features in order to estimate the validity of the
features from a classical "beauty" perspective, in a manner that is known. (Aarabi, P.,
Hughes, D., Mohajer, K., Emami, M., "The Automatic Measurement of Facial Beauty",
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics,
Tucson, Arizona, October 2001.) If the resulting feature locations are deemed to be
outside the range of acceptability, they are modified according to the feature location
beauty specifications. For example, if the eye and lip locations represent a highly
asymmetrical face, they are slightly modified to produce a more symmetrical face.

Applications and Implementations 

As stated earlier, the face modification system can be embodied in a variety of ways. For
example, the present invention can be implemented through a common website on the
World Wide Web. As stated earlier, this consists of the MODIFACE system being
implemented on a computer server (or servers) which takes in user uploaded photos, a set
of directives (such as arrows indicating which features and/or face should be included in
the final result - as illustrated in FIGS. 2 and 3), processes them as outlined above, and
generates the resulting photo for viewing by the user. The system also allows for the user
to intervene by adjusting the located face and facial features, and/or to issue new
directives for a new modified result. In other words, the system is initially in a full
automatic mode which tries its best to perform the modifications accurately, but allows
the user to make subsequent adjustments to refine the results (i.e. hence becoming
semi-automatic).

In accordance with one aspect of the present invention, a user first obtains one or more
digital images of a face and uploads them to the MODIFACE website. The resulting picture
is shown with face and feature locations, allowing a user to change locations and repeat
the operation.

Another implementation of the system of the present invention involves a user who
captures a photo through a scanned image, a digital camera, a camera-phone, or any
other device, system, or method of capturing an image, and sends this image (through
email, a website, text-messaging, or other mobile, wired, or wireless method of
communication) to a central MODIFACE server (e.g. web server (100) illustrated in
FIG. 1b) along with a set of directives about what modifications should be performed.
This server then automatically performs the requested operations as outlined above, and
sends the resulting photo and photo parameters (such as the locations of the face and
facial features) back to the user through email, text-messaging, or any other wired or
wireless means of communication. There can be further user intervention at this point,
such as adjusting the feature locations or directives, and this would result in another
iteration of the adjustments and photo sent to the MODIFACE server and the results sent
back to the user.

Other implementations of the present invention include kiosks located in malls or stores,
for example, or other locations, which can capture a photo of an individual and perform the
MODIFACE operations as requested by that individual, allowing intervention and
adjustments as described above. In this scenario, the MODIFACE system could either be
external to the kiosk, or internal, which would allow the device to operate independently.

Another implementation of the present invention is as a stand-alone or server-based 
kiosk. This system, in either the stand-alone or server-based modes, would consist of a 
stand, a keypad or keyboard or a set of buttons, a display (possibly a touch screen for
easier operation), as well as a camera mounted on the top. The camera captures images 
of a user standing in front of the kiosk and displays their image along with any requested 
modifications, on the kiosk display, using the approach described above. 

This approach could even be extended to mobile phones and handheld computers as well 
as digital cameras, which can contain mobile versions of the MODIFACE system for
direct utilization by a user. In this scenario the MODIFACE system would be embedded 
(running on a custom chip or as part of the device system itself) in the device directly, 
without the need for external communication. 

An alternative to mobile implementations is a custom developed electronic device which 
is operable to capture user photos and perform face modifications by uploading photos to
it or by means of an onboard camera. This device or digital pad would allow all of the 
modifications to be performed without the need for external communication. 

As mentioned above, the present invention can also be implemented as a web service, 
whereby face detection and/or face modifications are carried out on digital images and
the location of the detected face and facial features and the likelihood that the submitted
images contain faces (i.e. the face score) can be remotely returned to a user.

The present invention can be extended to searching, filtering, or processing the results of 
a visual search query submitted by a user to a database or web search system, such that 
the images are sorted based on their likelihood of containing a face. In this particular
embodiment of the present invention, the visual search query would rely on the 
techniques described herein for analyzing a digital image to determine whether there are 
facial features, and if facial features exist, determining their likely location within the 
digital image. 

The face detection aspect of the present invention can also be used for identifying faces
in particular magazines, internet sites, or newspapers, and automatically selecting the
current popular faces that are most frequently mentioned in the news or in particular
media. For example, an application in accordance with the methods of the present
invention can be created operable to automatically search popular media (newspapers,
internet sites, magazines, etc.) and detect faces in images and track the most frequently
cited names, returning the images that have the highest likelihood of containing a face.

The present invention can also be extended to conduct face detection and face feature 
extraction and replacement for forensic, investigative, or other police/security 
applications. 

The present invention can also be extended so that face detection and face feature
extraction and replacement is used for online, mobile, or handheld gaming applications,
e.g., a game whereby a user has to guess the name of the original face from which each
feature comes.

The present invention also includes using face detection and face feature extraction as
well as face modification (such as facelifting) for targeted cosmetic advertisement and
beauty suggestion purposes, such that cosmetic or beauty advice given is generated
automatically based on the facial features and facial characteristics of a submitted digital
image.

Extension to Videos

It should be understood that the method and system of the present invention, because of 
its "automatic" nature, can be applied to videos as well. According to this aspect, a video 
segment of a user can be selected for modification and a target photo or video would be 
selected with a set of directives about which feature to include from either video. Here,
the MODIFACE method and system would be applied on a frame by frame basis, which 
would then be improved by means of video tracking to fine tune the located features. 
Video tracking in this situation can resolve occasional errors in the location of the 
detected face and facial features by utilizing the relative continuity of the video frames. 

For this, the input to the video MODIFACE system would be one or more videos as well
as a set of directives. The output of the system would be a video with the modified 
features of the input video(s) and with the tracked feature locations which can be 
modified for further modification iterations, as in the previous cases. 

Another realization of the system for video applications would be automatically 
performing facelifts on the video of a user, by performing repeated face detections,
followed by full or partial facelifts using the system of this invention. 

Example Interface for Invention 

The interface for the MODIFACE system can take many different forms. One example 
web-based interface is depicted in FIG. 2. This interface allows for up to two images to 
be uploaded to the website, as well as for a set of input buttons (i.e., "hair", "face",
"eyes", "nose", "mouth", "Facelift" and "Modify") indicating which feature(s) to select 
from which face, whether to perform a facelift, and initiating the face modification. 

Once the user has entered the parameters and images (or videos), the system of the
present invention is operable to perform automatic face modification and show the results
to the user. Also shown to the user are the estimated locations of the facial features,
which can be dragged and moved by the user in order to yield a more accurate
modification, if necessary to achieve an accurate result, or if otherwise desired by the
user. This particular user intervention also illustrates the semi-automatic nature of the
present invention. It should be understood, however, that the automatic face modification
in accordance with the present invention, even if user intervention is involved, provides
significant reduction in time over attempting to perform the face modification manually.
If user intervention is required or desired, the user simply moves the feature or face
boxes, issues a new set of directives (or keeps the old ones), and selects the "Modify"
button once again. This feature of the present invention is illustrated in the context of a
representative interface in FIG. 3.

Audio Interface 

The present invention and interface require a user to choose a set of modifications by
selecting which feature/face/hair should be extracted from which photo. This type of
selection can be set using a computer mouse or pointing device, using a touch screen
interface, using a set of buttons or a keyboard, or with an acoustic interface.

The first type of acoustic interface that can be employed as input is based on speech
recognition. When the user says one or more directions, e.g., "right" or "left", the system can
make the appropriate selection by: (i) capturing the sound of the user with one or more
microphones; (ii) performing speech recognition on the sound; (iii) determining if the
user had made a valid request; and (iv) making the appropriate user selection.

Another type of interface involves sound localization. Here, two or more microphones
are used to determine the direction from which the sound is coming (based on the
time of arrival of the sound at the two microphones). If the sound is coming from the
right side of the device, computer, or kiosk, then the right option is selected. If the sound
is coming from the left side of the device, computer, or kiosk, then the left option is
selected. A similar selection criterion can be employed for up and down options as well.
In the case of up-down and left-right motions, a minimum of three microphones would be
needed, which based on their geometric positions perform sound localization in a
two-dimensional space.
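
For the two-microphone left/right case, the time-of-arrival comparison can be illustrated with a cross-correlation of the two recorded signals. This is only a sketch: cross-correlation is one common way to estimate the arrival-time difference and is an assumption here, not a technique named in the text, and the function name and the "center" fallback are hypothetical.

```python
import numpy as np

def select_side(left_mic, right_mic):
    """Pick 'left' or 'right' from the arrival-time difference between two
    equally sampled microphone signals."""
    corr = np.correlate(left_mic, right_mic, mode="full")
    # Positive lag: the left signal is a delayed copy of the right signal,
    # so the sound reached the right microphone first.
    lag = int(np.argmax(corr)) - (len(right_mic) - 1)
    if lag > 0:
        return "right"
    if lag < 0:
        return "left"
    return "center"
```

Extending this to up/down selection would need at least a third microphone, as noted above.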

Hair Transformation 



Another aspect of the present invention is a hair transformation or restyling technique. In
one aspect of the present invention, this feature is part of the face modification system of
the present invention. Hair transformation and restyling is also achieved in a manner
similar to the facelift. This is accomplished by extracting the face (after the user
requested modifications are performed) and blending it on top of the face in a photo that
contains the desired hair. In order to improve the effect, the size of the blended faces
should be slightly (e.g., 10 to 20%) larger than the actual size of the faces to yield the
best hair restyling result. Also, the bottom of the virtually transplanted face should be
extended (e.g., 30 to 100% of the face height) in order to capture the neck, jaw, and chin
structure of the desired face in the new photo. Once the hair has been restyled, the user
can fine tune the results by adjusting the face size and locations in either photo and
repeating this procedure. The basic method for the automatic hair transformation and
restyling is depicted in FIG. 4.
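
The enlargement and downward extension of the face box described above can be written as a small helper. The sketch assumes the face box is given as (top, left, height, width) and that the enlargement is centered on the detected box; the function name and default values (15% enlargement, 50% chin extension) are illustrative choices within the stated ranges, not values from the text.

```python
def expand_face_box(top, left, height, width, scale=1.15, chin_extension=0.5):
    """Enlarge a face box about its center by `scale`, then extend its bottom
    edge by `chin_extension` times the original face height."""
    new_h = height * scale
    new_w = width * scale
    new_top = top - (new_h - height) / 2
    new_left = left - (new_w - width) / 2
    new_h += chin_extension * height  # extend downward to cover jaw, chin, and neck
    return new_top, new_left, new_h, new_w
```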

Eyebrow Modification

By using the knowledge about the location of the eyes of a face, the location of the 
eyebrow can be assumed to be directly above the eye. This allows the eyebrows of one 
photo to be placed in the place of the eyebrow of a second photo, as follows: 

1. The faces in each photo are detected, manually or automatically, and enhanced by
performing a partial histogram equalization.

2. The features of the face including the eyes are localized automatically or
semi-automatically.

3. If the user has elected to do so, the desired eyebrow box, defined as the box
directly on top of the eye, belonging to the first photo is blended on top of the
eyebrow box belonging to the second photo. This blending is performed as
before, with a blending mask being used to define the amount of the original
eyebrow, the desired eyebrow, and the recolored and desired eyebrow.

As described above, user adjustments of the feature boxes can be allowed to fine tune the
locations of the features. In the case of the eyebrows, moving the eye box will adjust the
locations from which eyebrows are extracted and where eyebrows are placed.

Utilization for Custom Advertisement

When the user has selected certain operations such as hair change or facelift, the system 
can utilize this information to get a sense of what type of information the user is 
interested in, and then to provide custom advertising. For example, if a person uploads a 
photo and then requests a hair change operation, the system can show hair stylist 
advertising on the results page. The geographic location information of the user along 
with the custom operation request can be used to further narrow the range of products and 
services that the user would be interested in. Custom advertisements in this scenario 
would work as follows: 

1. The user uploads one or more photos and requests a set of operations to be 
performed on these photos. 

2. The system performs the requested operation, and returns the resulting image 
along with the facial feature locations to the user. Along with this information, 
the system also sends an advertisement that is custom tailored towards the 
geographical location of the user, the requested operation, and/or any other 
customization information that may be available about the user.

3. If the user clicks on the advertisement, they will be directed either to a target
website or Internet destination, or to a custom page with further information about
the specific operation and the advertisers.

Custom advertisements can also be delivered in mobile and email applications where the
image and the necessary information are sent by the user to the MODIFACE email/server
(via email, text messaging, or other means), the necessary operations are performed by
MODIFACE, and the result is emailed/sent back to the user or a destination specified by them.
Along with the result that is sent back, a set of advertisements custom tailored to the 
geographical area, requested operation, and/or any other custom information about the 
user will also be sent. 

For example, if a user sends an image to MODIFACE through email requesting a
facelift operation, MODIFACE would send back the resulting face-lifted image along
with advertisements from dermatologists, skin specialists, and/or cosmetic products that
are customized and relevant to the user. A similar procedure would occur for selective
facelift and face operations.

Utilization within a Social Network

Another aspect of the present invention is the definition of a web-enabled social network. 
A social network generally consists of an online site that allows users to create accounts
and login, to select and add others as friends, and in accordance with the present
invention to upload photos or photo modification results to their accounts. They can also
view public photos (all photos uploaded or modified by users can be either public or
private) of their friends or others on this social network. They can also comment on their
own profile, or their own photos, or those of their friends. Finally, they can select any of
their photos to be included as a "celebrity" photo which others can use for the basis of
their own face modifications. Either creating an account or the act of making a personal
photo a "celebrity" photo can be monetized through a user charge or through online
advertisements that target the specific actions and requests of the user.

An example of an interface for the face modification social network is illustrated in FIG.
5. As shown therein, a relatively large image selection area is provided, in this case
operable to display 9 images, and with controls to allow a user to flip between different
"pages" of images. Towards the bottom of the screen a MODIFACE image modification
area is provided, with similar controls as shown in FIG. 2 (i.e., hair, face, etc.). However,
in this case there is also provided a scrolling image bar to allow the user to easily select
images to modify.

Dynamic Images 

Another application of the present invention is that of dynamic images. Currently, an
image for Internet applications is a static file stored on a server. The present invention is
operable to generate dynamic images of a person's face, such that certain elements of the
image change on a monthly, weekly, daily, hourly, or any other time-based or
request-based changing schedule.

For example, a user may upload their own photo and they would get a link to a version of
their photo that every hour (or over some other time interval) would include their face
with a different hair style. The target hair styles, in this example, are chosen either from
a celebrity/model list specified by the user or from all the celebrity image collections
available to the invention. The steps for obtaining dynamic images would be as follows:
(i) a user uploads a photo and selects a set of celebrity/model photos with which to
replace features; (ii) the user selects the features to be changed (i.e. face, facial features,
hair, etc.); (iii) the user selects a schedule for changes (i.e. weekly, monthly, daily, per
download, etc.); (iv) the user is given a URL to an image file stored on the server which
is modified based on the user entered parameters and schedule; and (v) the user embeds
their image, using the generated URL, in their own site, in other sites, or in any other
application.
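
As a small illustration of the time-based schedule in step (iii), a server could pick which target style to apply from the current time each time the image URL is requested. The sketch below assumes a fixed, ordered list of styles and a simple rotation; the function name, the rotation rule, and the use of seconds as the time unit are assumptions made for the example.

```python
def scheduled_style(styles, now_seconds, interval_seconds=3600):
    """Return the style to use at time `now_seconds`, advancing to the next
    entry of `styles` once per interval and wrapping around at the end."""
    index = int(now_seconds // interval_seconds) % len(styles)
    return styles[index]
```

A per-download schedule could instead advance the index on every request rather than on elapsed time.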

Three Dimensional Face Reorientation 

In situations where the target and/or desired face are tilted to one direction or rotated, the 
modified face will not look natural. The way to resolve this issue is by performing
(optionally, only if a user requests so) a three dimensional face reorientation or 
correction. 

It is assumed that both the final reoriented face and the original face lie on two different 
two-dimensional planes in three-dimensional space. The goal is to shift one plane to
become horizontal to the look direction of the camera with which the photo was taken. 
FIG. 20 further illustrates this process. 

A two-dimensional plane in three-dimensional space can be projected onto another plane 
according to the following algebraic formula: 



\[
\begin{bmatrix} X \\ Y \\ 1 \end{bmatrix}
= P \cdot
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]



where (x,y) is a point on the first plane, (X,Y) is a point on the second plane, and P is
a 3 by 3 matrix which contains the projective transformation parameters.

When we have located the left eye, the right eye, and lips, we obtain a set of coordinates
on the face plane (here, we are assuming that the face is in fact just a plane), as follows:



Lip location = (mx,my)
Left eye location = (lx,ly)
Right eye location = (rx,ry)



We also have the ideal location of the facial features on a frontal face. This ideal location 
will be denoted as follows: 

Ideal lip location = (Mx,My) 
Ideal left eye location = (Lx,Ly) 
Ideal right eye location =(Rx,Ry) 



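The goal is to project the original face plane onto the ideal face plane, as follows:

\[
P \cdot
\begin{bmatrix} mx & lx & rx \\ my & ly & ry \\ 1 & 1 & 1 \end{bmatrix}
=
\begin{bmatrix} Mx & Lx & Rx \\ My & Ly & Ry \\ 1 & 1 & 1 \end{bmatrix}
\]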



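Three points on each plane are enough to find the projective transformation between the
planes, as follows:

\[
P =
\begin{bmatrix} Mx & Lx & Rx \\ My & Ly & Ry \\ 1 & 1 & 1 \end{bmatrix}
\cdot
\begin{bmatrix} mx & lx & rx \\ my & ly & ry \\ 1 & 1 & 1 \end{bmatrix}^{-1}
\]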
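
The solve for P from the three located features can be sketched numerically. The example below is a minimal illustration assuming NumPy and three non-collinear feature points (so that the matrix of original locations is invertible); the function names are hypothetical. Note that because both matrices have a bottom row of ones, the P obtained this way always has (0, 0, 1) as its last row, i.e. it is an affine transformation.

```python
import numpy as np

def projection_from_three_points(original_pts, ideal_pts):
    """Solve P such that P @ [original; 1] = [ideal; 1].
    Each argument is a list of three (x, y) points: lip, left eye, right eye."""
    orig = np.vstack([np.array(original_pts, dtype=float).T, np.ones(3)])
    ideal = np.vstack([np.array(ideal_pts, dtype=float).T, np.ones(3)])
    return ideal @ np.linalg.inv(orig)

def project_point(P, x, y):
    """Map a point (x, y) on the original face plane to (X, Y) on the ideal plane."""
    X, Y, W = P @ np.array([x, y, 1.0])
    return X / W, Y / W
```

Applying project_point to every pixel coordinate (or, in practice, the inverse mapping to every output pixel) would give the re-oriented image.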



Once the projection matrix P is found, it is applied to every point on the image in order to
result in a re-oriented face. The projected face and feature locations are also determined
through this transformation.

Iterative Face and Facial Feature Detection

In order to improve the results of face detection, and to make the face detection more 
reliable and consistent, an iterative approach is used to automatically check the validity of 
a detected face, as follows: 

1. For each detected face, a new face score is computed, where this face score is a
multiplication of the original face score and a feature score.
2. The feature scores are a multiplication of a parameter that measures the
deviation-from-norm of the feature positions and individual validity metrics (including
gradient magnitude symmetry) for the features.
3. If the facescore, featurescore, or their multiplication, is below a preset threshold,
the face detection is performed again to find another region within the image that
has the maximum facescore*featurescore.

In other words, the feature likelihoods are included in the face score calculations.
Another benefit of this approach is that by detecting features at the face detection stage, it
is possible to compensate for tilts and rotations easily based on the locations of the
features.
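
The validity check above can be sketched as a selection over candidate detections. This is an illustration under assumptions: the detector is taken to supply a list of candidate regions with their scores already in the range 0 to 1, the threshold value of 0.5 is arbitrary, and the function names are hypothetical.

```python
import math

def feature_score(deviation_term, validity_metrics):
    """Product of the deviation-from-norm parameter and the individual validity metrics."""
    return deviation_term * math.prod(validity_metrics)

def select_face(candidates, threshold=0.5):
    """`candidates` is a list of (region, face_score, feature_score) tuples, with
    the detector's first choice at index 0. Returns (region, combined_score)."""
    region, face, feat = candidates[0]
    combined = face * feat
    if min(face, feat, combined) >= threshold:
        return region, combined
    # The first detection failed the check: fall back to the region with the
    # maximum facescore*featurescore.
    region, face, feat = max(candidates, key=lambda c: c[1] * c[2])
    return region, face * feat
```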

Facelift 

This section outlines in detail the procedure for the facelift operation. For the facelift 
operations, the following steps are performed: 

1. The facial features (eyes, nose, and lips) are detected and extracted (i.e. copies are
made of these original features).
2. The face is 'lifted' by smoothing it, either selectively to give the appearance of
specific operations, or in entirety with uniform smoothing in the middle of the
face and less smoothing around the edges of the face.
3. The original features are blended back onto the smoothed/lifted face.

FIG. 21 illustrates the steps followed for the facelift process.

Selective Automated Facelift

The invention can also be utilized for visualizing the effects of a detailed or selective
facelift operation or plurality of operations. A user can request selective partial facelifts,
aside from a full facelift which would operate on the entire face. For example, a user
could be shown an image of a template face or the user's face, and the user could then
select which parts of the face should undergo plastic surgery. This particular process
relies on the methods described above. The steps involved with this process are as
follows:

1. The system automatically detects the face of the person in a user uploaded or
selected photo.

2. The system detects the features (eyes, lips, nose) within the face in order to get
bearings on the face.

3. The user selects either a full facelift option, or a subset of partial options
including but not limited to forehead lifts, eyebrow lifts, below-eye lifts, inter-brow
lifts, outer cheek lifts, inner cheek lifts, lip enhancement and lip lifts, as well
as jaw restoration and jaw lifts.

4. Based on the user selected partial operations, the system performs the operation as
follows:

a. The system first extracts the face features (eyes, lips, nose);

b. The system performs the selective surgery visualizations by either
smoothing (convolving with a two-dimensional smoothing filter) the
specific region or adjusting its size (for example, in the case of a brow
lift or lip enhancement, the vertical dimensions of the eyebrows or lips
are increased in order to give the impression of feature adjustment);
and

c. The extracted features are blended back onto the face, either at their
original size or slightly adjusted in size in order to give the impression
of an adjustment.

An example of the possible modifications available for the selective automated facelifts is
shown in FIG. 14.

An alternative realization of a selective facial lift and facial augmentation system is
described below. It consists of a user uploading a photo to a computer server and the
system automatically detecting the face and allowing for user refinement of this face,
followed by the user selection of a set of automatic facial operations (such as below-eye
lift, forehead lift, mid-brow lift, eyebrow lift, inner cheek lift, outer cheek lift, chin
restoration, lip augmentation) and operation strengths (none, subtle, moderate, and max).
The user then selects a "Show Me" button or equivalent, which initiates the processing of
the user uploaded image automatically according to the user operation specification,
showing the resulting image once the processing is complete.

An example of the interface for a selective automated facelift is shown in FIG. 15.

Details of Selective Face Modification Procedures 

The following outlines the details of the specific facial modification procedures. For the
procedures, the first step is generally to take an image I(x,y) submitted by the user on a
website, sent by a mobile phone, or other means, and to (1) compute the location of the
face, and (2) extract the facial sub-image to form the face image F(x,y). Then, as
previously described in FIG. 1, the face image is smoothed by convolving it with a
smoothing filter, as follows:

Smoothed face = S(x,y) = F(x,y) * Q(x,y)

where * denotes a two-dimensional convolution operation and Q(x,y) is a
smoothing/averaging mask.
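
As an illustration of the smoothing step, the sketch below convolves a grayscale face image with a uniform averaging mask. The choice of a square box mask, the `radius` parameter, and the edge-replication border handling are assumptions made for this example; the disclosure does not fix a particular Q(x,y).

```python
def smooth(face, radius=1):
    """Return S = F * Q for a (2*radius+1) x (2*radius+1) averaging mask Q.

    face is a list of rows of numbers (one grayscale channel). Pixels
    outside the image are replaced by the nearest edge pixel.
    """
    h, w = len(face), len(face[0])
    n = (2 * radius + 1) ** 2  # number of taps in the averaging mask
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += face[yy][xx]
            out[y][x] = total / n
    return out
```

A larger radius gives a stronger smoothing effect; a colour image could be handled by applying the same function to each channel.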

Once the smoothed face is obtained through the above filtering procedure, the left eye,
right eye, nose, and lip sub images are also found in the original face image F(x,y) using
the approach described previously in this disclosure. The sub images are denoted as
LE(x,y), RE(x,y), N(x,y), and L(x,y), in order to denote the left eye, right eye, nose, and
lip sub images respectively.

One or more of the specific facial modification operations are then performed using the
above sub images. As explained earlier, the facial modification operations are not
exhaustive, and others are possible. The specific operations described herein are: (i)
below-eye lift, (ii) forehead lift, (iii) mid-brow lift, (iv) inner cheek lift, (v) outer cheek
lift, (vi) chin restoration, (vii) eyebrow lift, (viii) lip augmentation, and (ix) combination
of the above.

Below-eye Lift 

A below-eye lift consists of lifting or removing the wrinkles directly around and below 
the eyes. This is achieved by: 

1. Applying a below-eye mask to the original face image F(x,y) (applying implies 
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ). 

2. Applying the inverse (i.e. one minus) of the below-eye mask to the smoothed face 
image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ). 

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y) ). 

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the 
processed face image Z(x,y) using the feature blending approach described 
previously in this disclosure. 
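
Steps 1 to 4 amount to a per-pixel blend, which can be sketched as below. The function name and the dictionary of strength labels are illustrative; the numeric weights are the example values quoted in the note. Following the steps as written, a mask value of 1 keeps the original pixel and a mask value of 0 takes the smoothed pixel.

```python
# Example strength weights 'a' taken from the note above.
STRENGTH = {"NONE": 1.0, "SUBTLE": 0.8, "MODERATE": 0.6, "MAX": 0.4}


def masked_lift(face, smoothed, mask, strength):
    """Blend F, S and M into Z for one grayscale channel.

    face, smoothed, mask: equally sized lists of rows; mask values in [0, 1].
    """
    a = STRENGTH[strength]
    result = []
    for f_row, s_row, m_row in zip(face, smoothed, mask):
        row = []
        for f, s, m in zip(f_row, s_row, m_row):
            u = f * m                # step 1: U = F x M
            v = s * (1.0 - m)        # step 2: V = S x (1 - M)
            p = u + v                # step 3: P = U + V
            row.append(a * f + (1.0 - a) * p)  # step 4: Z = aF + (1-a)P
        result.append(row)
    return result
```

The feature sub images would then be blended back on top of the returned image (step 5), which is outside the scope of this sketch. The same function applies to the other mask-based operations by swapping the mask.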

Forehead Lift 

A forehead lift consists of lifting or removing the wrinkles directly in the forehead area of 
the face. This is achieved by: 

1. Applying a forehead mask to the original face image F(x,y) (applying implies
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ).

2. Applying the inverse (i.e. one minus) of the forehead mask to the smoothed face
image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ).

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y) ).

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the 
processed face image Z(x,y) using the feature blending approach described 
previously in this disclosure. 

Mid-brow Lift 

The mid-brow lift consists of lifting or removing the wrinkles directly between the 
eyebrows. This is achieved by: 

1. Applying a mid-brow mask to the original face image F(x,y) (applying implies 
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ). 

2. Applying the inverse (i.e. one minus) of the mid-brow mask to the smoothed face 
image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ). 

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y)). 

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the
processed face image Z(x,y) using the feature blending approach described
previously in this disclosure.

Inner Cheek Lift 

An inner cheek lift consists of lifting the skin and removing the wrinkles directly around 
the lips and nose. This is achieved by: 

1. Applying an inner cheek mask to the original face image F(x,y) (applying implies
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ).

2. Applying the inverse (i.e. one minus) of the inner cheek mask to the smoothed 
face image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ). 

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y)). 

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the 
processed face image Z(x,y) using the feature blending approach described 
previously in this disclosure. 

Outer Cheek Lift 

An outer cheek lift consists of lifting and removing the wrinkles in the outer cheeks area 
of the face, as well as removing imperfections as a result of aging. This is achieved by: 

1. Applying an outer cheek mask to the original face image F(x,y) (applying implies
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ).

2. Applying the inverse (i.e. one minus) of the outer cheek mask to the smoothed
face image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ).

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y)).

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the
processed face image Z(x,y) using the feature blending approach described
previously in this disclosure.

Chin Restoration 

Chin restoration consists of lifting the skin and removing the signs of aging around the
jaw and chin of the face. This is achieved by:

1. Applying a chin mask to the original face image F(x,y) (applying implies
pixel-by-pixel multiplication) (i.e. U(x,y)=F(x,y) x M(x,y) ).

2. Applying the inverse (i.e. one minus) of the chin mask to the smoothed face 
image S(x,y) (i.e. V(x,y)=S(x,y) x (1-M(x,y)) ). 

3. Adding the resulting sub images of 1 and 2 (P(x,y)=U(x,y)+V(x,y)).

4. Combining the original face image F(x,y), times a strength weight, plus the result
of 3, times one minus the strength weight (i.e. Z(x,y)=aF(x,y)+(1-a)P(x,y) ).

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 0.8, for operation:MODERATE, 'a' could be 0.6, and for
operation:MAX, 'a' could be 0.4.)

5. The left eye, right eye, nose, and lip sub images are blended on top of the
processed face image Z(x,y) using the feature blending approach described
previously in this disclosure.

Eyebrow Lift 

An eyebrow lift consists of lifting the eyebrows in order to reduce and remove the signs 
of aging around the eyebrows. This is achieved by: 

1. Extending the left eye and right eye locations to cover the eyebrows of the face.

2. Stretching the extended left and right eye sub images based on the strength factor
of the operation (i.e. vertically scaling the left and right sub images by a factor
'a').

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 1.05, for operation:MODERATE, 'a' could be 1.1, and for
operation:MAX, 'a' could be 1.15.)

3. The scaled left eye and scaled right eye sub images are blended on top of the 
original face F(x,y) using the feature blending approach described previously in 
this disclosure. 
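
The vertical scaling in step 2 can be sketched as a simple row resampling. Nearest-neighbour sampling and the function name are assumptions for illustration; any image resizing routine that scales only the vertical dimension would serve the same purpose.

```python
def stretch_vertically(sub_image, a):
    """Scale the number of rows of sub_image by factor a (nearest neighbour).

    sub_image is a list of rows; the width of each row is left unchanged.
    """
    h = len(sub_image)
    new_h = max(1, round(h * a))
    # Map each output row back to the closest source row.
    return [list(sub_image[min(h - 1, int(r * h / new_h))]) for r in range(new_h)]
```

For example, a 20-row eye-and-eyebrow sub image stretched with a = 1.1 becomes 22 rows tall before being blended back onto the face.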

Lip Augmentation 

Lip augmentation consists of lifting the lips in order to reduce and remove the signs of 
aging around the lips. This is achieved by: 

1. Stretching the lip location sub image based on the strength factor of the operation
(i.e. vertically scaling the lip sub image by a factor 'a').

(note: the strength factor 'a' is determined based on the strength of the operation.
As an example, for operation:NONE, 'a' would be 1, for operation:SUBTLE, 'a'
could be 1.05, for operation:MODERATE, 'a' could be 1.1, and for
operation:MAX, 'a' could be 1.15.)

2. The scaled lip sub image is blended on top of the original face F(x,y) using the
feature blending approach described previously in this disclosure.

Combination of the above 

The above approaches can be combined by repeatedly applying each operation, using
possibly different strength factors, on a face. Here, for example, the first operation is
performed on the original face, followed by another operation being performed on the
result of the first operation, and so forth. The end result is an automatic system for
facelifting and face perfecting which uses the guidelines of a user to perform specific
facial operations.

It should be understood that the image modifications that replicate cosmetic procedures,
as described above (i.e. brow lifts, mid-brow lifts, forehead lifts and enhancement,
under-eye and near-eye skin lifts and enhancements, inner cheek lifts and enhancement,
outer cheek lifts and enhancement, lip enhancement and augmentation, jaw/chin
enhancement and restoration or other facial and/or cosmetic operation), can be applied to
digital images of a face to varying degrees. In one particular aspect of the present
invention, a user can modify the "strength" levels of these particular image modifications,
for example, by selecting (using a suitable graphic user interface) "no operation", "subtle
operation", "moderate operation" or "maximum operation". The result of the image
modification, i.e. the image of the face smoothed, enhanced or augmented in accordance
with the mentioned procedures, can be displayed or otherwise made available to the user.
For example, the modified image can be displayed on a computer or communication
device, or communicated via the communication network as a download, email
communication or other communication.

Fusion Based Detection Approach

As mentioned above, the present invention contemplates use of prior art face detection
techniques. In another aspect of the present invention, a "fusion-based" face detection
method is provided. Specifically, a method is disclosed for face detection which involves
fusion of a plurality of simple face detector techniques described below to provide a face
detection technique with improved characteristics.

As shown in FIG. 11a, a face search inside a user submitted image starts with a
large box with the height to width ratio being 4 to 3, and with the width being 60% of the
image width, for example. The box is moved to all locations inside the image one pixel
at a time for greatest accuracy (the step can be increased for greater efficiency). After the
entire image has been searched, the width of the box is decreased (again one pixel at a
time) with the height to width ratio remaining fixed at 4 to 3. The progression is depicted
in FIG. 11a, FIG. 11b and FIG. 11c, starting with a search over the entire image with a
large face box, and continuing with smaller face boxes until a suitable hit or a minimum
face box threshold is reached.

For all box sizes and box locations, a face score is computed which corresponds to the
likelihood of a face at that location. The box with the highest score is chosen as the face
location and size estimate. This is similar to most known template-based face detection
algorithms. Generally speaking, the most difficult and sensitive part of template-based
face detection is the score computations.
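
The exhaustive box search described above can be sketched as a generator of candidate boxes plus a maximisation over a scoring function. The parameter names, the default `min_width`, and the integer rounding of the box height are assumptions for this example; `score_fn` stands in for whichever face score is used.

```python
def candidate_boxes(image_w, image_h, start_frac=0.6, min_width=20, step=1):
    """Yield (x0, y0, W, H) boxes with H:W fixed at 4:3, largest boxes first."""
    width = int(image_w * start_frac)
    while width >= min_width:
        height = (4 * width) // 3
        # Slide the box over every position where it fits inside the image.
        for y0 in range(0, image_h - height + 1, step):
            for x0 in range(0, image_w - width + 1, step):
                yield (x0, y0, width, height)
        width -= 1  # shrink the box one pixel at a time


def best_box(image_w, image_h, score_fn, **kwargs):
    """Return the candidate box with the highest score_fn(x0, y0, W, H)."""
    return max(candidate_boxes(image_w, image_h, **kwargs),
               key=lambda box: score_fn(*box))
```

`best_box` raises `ValueError` if no box fits inside the image, since `max` is then called on an empty sequence. Increasing `step` trades accuracy for speed.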

Face Detection Metrics 

I(x,y) will denote the original image at location (x,y). The content of each location is a
3-element vector of red, green, and blue components (i.e. RGB) for each pixel. Ψ(x,y)
will denote the gradient magnitude (in the vertical direction) of the image I(x,y), and
consists of a non-negative value for each location (x,y). T(x,y) will be a binary template
image used to fit a binary template to the face.

It should be understood that all face score metrics are a function of the location and size
of the current box. The top left corner of this box is denoted as (x_0, y_0), while the width
and height of the box are denoted as W and H, respectively.

It should also be noted that for visual simplicity, we have used E to denote the expected
value of a certain variable, where the expectation is usually performed over the x and y
variables. This theoretical expectation is practically estimated as follows:

$$E[Z(x,y)] \;\propto\; \sum_{y} \sum_{x} Z(x,y)$$

where Z(x,y) is some function of x and y, and where the constant of proportionality is set
by a normalization constant c.

Several metric approaches are possible. 

A. Symmetry-Based (SYM) Face Detection Metric 

This approach is a mirror mean square error measure applied to the image gradient. It
consists of folding the current face box from the middle and taking the average of the
pair-wise square difference of the points inside the box that overlap, as shown in FIG. 12.

The motivation here is that if the box is perfectly symmetrical in the vertical axis running
through the middle of the box, then it will have the smallest mean square difference. In
other words, this is a measure of horizontal symmetry of the current face box.

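For the actual metric, a constant is optionally added to the mean square difference and
the result is inverted, so that a higher metric is indicative of greater symmetry. The final
symmetry metric is:

$$F_{sym}(x_0, y_0, W, H) = \left(1 + E\left[\left(\Psi(x,y) - \Psi(W + 2x_0 - x,\, y)\right)^2\right]\right)^{-1}$$

$$= \left(1 + \frac{2}{WH} \sum_{x=x_0}^{x_0+W/2-1} \; \sum_{y=y_0}^{y_0+H-1} \left(\Psi(x,y) - \Psi(W + 2x_0 - x,\, y)\right)^2\right)^{-1}$$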
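
A minimal sketch of the symmetry metric on a precomputed gradient-magnitude array follows. One deliberate deviation is flagged here as an assumption: the mirror partner of column x is taken as 2*x0 + W - 1 - x, so that it stays inside the box for zero-based indexing, which differs by one from the index printed in the formula. The function name is illustrative.

```python
def symmetry_metric(grad, x0, y0, W, H):
    """Return 1 / (1 + mean squared difference between mirrored columns).

    grad[y][x] holds the gradient magnitude; (x0, y0) is the top-left
    corner of the box and W, H its width and height.
    """
    half = W // 2
    total = 0.0
    for y in range(y0, y0 + H):
        for x in range(x0, x0 + half):
            mirror_x = 2 * x0 + W - 1 - x  # assumed zero-based mirror index
            diff = grad[y][x] - grad[y][mirror_x]
            total += diff * diff
    mse = total / (half * H)  # average over the W/2 * H folded pairs
    return 1.0 / (1.0 + mse)
```

A perfectly mirror-symmetric box gives a value of 1, and the value falls towards 0 as the left and right halves diverge.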

B. Template Subtraction (TS) Face Detection Metric

The template subtraction metric measures the degree to which the current face box 
resembles a face. It consists of applying a binary face template which is used to 
differentiate between the high gradient and the low gradient regions of a face. 

A simple template (FIG. 13a) was utilized, though other choices (e.g., FIG. 13b) would
yield similar results. These models were based on general characteristics of the face (i.e.
drawn by the author in a matter of seconds), and were not in any way trained or
optimized for the face detection task.

The template subtraction metric can be simply stated as the average gradient magnitude
of the pixels corresponding to the white (1) template pixels, minus the average gradient
magnitude of the pixels corresponding to the black (0) template pixels. In other words,
the template subtraction metric can be defined as:

$$F_{TS}(x_0, y_0, W, H) = E[\Psi(x,y) \mid T(x,y) = 1] - E[\Psi(x,y) \mid T(x,y) = 0]$$

$$= \frac{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \Psi(x,y)\, T(x,y)}{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} T(x,y)} - \frac{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \Psi(x,y)\, (1 - T(x,y))}{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} (1 - T(x,y))}$$


C. Template Ratio (TR) Face Detection Metric 

The template ratio, which is another template based metric, is the average gradient
magnitude of the pixels corresponding to the white (1) pixels of the template divided by
the sum of both the average gradient magnitude of the white (1) template pixels and the
average gradient magnitude of the black (0) template pixels, as defined below:

$$F_{TR}(x_0, y_0, W, H) = \frac{E[\Psi(x,y) \mid T(x,y) = 1]}{E[\Psi(x,y) \mid T(x,y) = 1] + E[\Psi(x,y) \mid T(x,y) = 0]}$$

$$= \left(1 + \frac{\left(\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} T(x,y)\right) \cdot \sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \Psi(x,y)\, (1 - T(x,y))}{\left(\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} (1 - T(x,y))\right) \cdot \sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \Psi(x,y)\, T(x,y)}\right)^{-1}$$
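
The two template metrics (TS and TR) share the same pair of averages, so a sketch can compute them once. It is assumed here that the binary template has already been resized to the box and is indexed relative to the box's top-left corner; the function names are illustrative.

```python
def template_means(grad, template, x0, y0, W, H):
    """Return (mean gradient under white pixels, mean under black pixels).

    grad[y][x] is the gradient magnitude of the image; template[j][i] is
    1 (white) or 0 (black) for row j, column i of the W x H box.
    """
    white_sum = black_sum = 0.0
    white_n = black_n = 0
    for j in range(H):
        for i in range(W):
            g = grad[y0 + j][x0 + i]
            if template[j][i]:
                white_sum += g
                white_n += 1
            else:
                black_sum += g
                black_n += 1
    return white_sum / white_n, black_sum / black_n


def template_subtraction(grad, template, x0, y0, W, H):
    white, black = template_means(grad, template, x0, y0, W, H)
    return white - black


def template_ratio(grad, template, x0, y0, W, H):
    white, black = template_means(grad, template, x0, y0, W, H)
    return white / (white + black)
```

The sketch does not guard against a template that is entirely white or black, or a box with zero gradient everywhere; a practical version would need to handle those divisions by zero.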



D. Skin-Detector-Based (SKIN) Face Detection Metric

A pixel skin detector was employed to find the skin-like regions inside the image using a
known technique, and the amount of skin in each test patch was used as an indication of
the likelihood of a face.

The pixel (x,y) of image I(x,y) is skin, or skin(I(x,y)) = 1, if the red (R), green (G), and
blue (B) components of that pixel obey the following conditions, for example:

R>95 and G>40 and B>20 and R-G>15 and R>B, or
R>220 and G>210 and B>170 and |R-G|<=15 and R>B and G>B

The skin-based face detection metric can thus be defined as:

$$F_{skin}(x_0, y_0, W, H) = E[\mathrm{skin}(I(x,y))] = \frac{1}{WH} \sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \mathrm{skin}(I(x,y))$$
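
The example colour rule and the resulting metric translate directly into code. The sketch assumes 8-bit RGB values stored as `image[y][x] = (R, G, B)`; the function and variable names are illustrative.

```python
def is_skin(r, g, b):
    """Apply the two example RGB conditions; True means skin(I(x,y)) = 1."""
    rule_a = r > 95 and g > 40 and b > 20 and r - g > 15 and r > b
    rule_b = (r > 220 and g > 210 and b > 170
              and abs(r - g) <= 15 and r > b and g > b)
    return rule_a or rule_b


def skin_metric(image, x0, y0, W, H):
    """Fraction of skin pixels inside the box with top-left corner (x0, y0)."""
    count = sum(
        1
        for y in range(y0, y0 + H)
        for x in range(x0, x0 + W)
        if is_skin(*image[y][x])
    )
    return count / (W * H)
```

The metric lies between 0 (no skin-coloured pixels in the box) and 1 (every pixel classified as skin).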

E. Eye-Lip Total Symmetry (ELTS) Face Detection Metric

The ELTS metric measures the ratio of the sum of gradients in the top half of the face to
the sum of gradients in the whole face, as defined below:

$$F_{ELTS}(x_0, y_0, W, H) = \frac{E[\Psi(x,y) \mid x,y \text{ in top half of face}]}{E[\Psi(x,y)]} = \frac{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H/2-1} \Psi(x,y)}{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H-1} \Psi(x,y)}$$

Ideally, a face should have strong gradients around the eyes and the lips/nose, making the
ideal ELTS measure around 0.5. As a result, the following adjustment is made to the
final ELTS measure:

$$F_{ELTS}(\cdot) = \min\left(F_{ELTS}(\cdot),\; 1 - F_{ELTS}(\cdot)\right)$$

F. Eye Total Symmetry (ETS) Face Detection Metric

Similar to the ELTS, the ETS measures the symmetry of the total gradients in the top half
of the face. It is the ratio of the gradient sum in the top left quadrant of the face to the
gradient sum of the top half of the face, as defined below:

$$F_{ETS}(x_0, y_0, W, H) = \frac{E[\Psi(x,y) \mid x,y \text{ in top left quadrant}]}{E[\Psi(x,y) \mid x,y \text{ in top half of face}]} = \frac{\sum_{x=x_0}^{x_0+W/2-1} \sum_{y=y_0}^{y_0+H/2-1} \Psi(x,y)}{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0}^{y_0+H/2-1} \Psi(x,y)}$$

As before, in an ideal case the ETS measure should be 0.5. Consequently, the
following adjustment is performed to the ETS measure to ensure that its maximum value
is 0.5:

$$F_{ETS}(\cdot) = \min\left(F_{ETS}(\cdot),\; 1 - F_{ETS}(\cdot)\right)$$

G. Lip Total Symmetry (LTS) Face Detection Metric 

Just like the ETS, the LTS measures the symmetry of the gradient sums in the bottom half
of the face, as defined below:

$$F_{LTS}(x_0, y_0, W, H) = \frac{E[\Psi(x,y) \mid x,y \text{ in bottom left quadrant}]}{E[\Psi(x,y) \mid x,y \text{ in bottom half of face}]} = \frac{\sum_{x=x_0}^{x_0+W/2-1} \sum_{y=y_0+H/2}^{y_0+H-1} \Psi(x,y)}{\sum_{x=x_0}^{x_0+W-1} \sum_{y=y_0+H/2}^{y_0+H-1} \Psi(x,y)}$$

As before, we adjust the LTS such that its maximum and ideal value is 0.5, as follows:

$$F_{LTS}(\cdot) = \min\left(F_{LTS}(\cdot),\; 1 - F_{LTS}(\cdot)\right)$$
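
The ELTS, ETS and LTS metrics are all ratios of gradient sums over sub-regions of the box, followed by the same fold towards 0.5. A compact sketch, with illustrative function names and integer halving of W and H assumed:

```python
def region_sum(grad, x_start, x_end, y_start, y_end):
    """Sum grad[y][x] over x_start <= x < x_end and y_start <= y < y_end."""
    return sum(grad[y][x]
               for y in range(y_start, y_end)
               for x in range(x_start, x_end))


def folded(ratio):
    """Apply min(F, 1 - F) so that the ideal and maximum value is 0.5."""
    return min(ratio, 1.0 - ratio)


def elts(grad, x0, y0, W, H):
    top = region_sum(grad, x0, x0 + W, y0, y0 + H // 2)
    whole = region_sum(grad, x0, x0 + W, y0, y0 + H)
    return folded(top / whole)


def ets(grad, x0, y0, W, H):
    top_left = region_sum(grad, x0, x0 + W // 2, y0, y0 + H // 2)
    top = region_sum(grad, x0, x0 + W, y0, y0 + H // 2)
    return folded(top_left / top)


def lts(grad, x0, y0, W, H):
    bottom_left = region_sum(grad, x0, x0 + W // 2, y0 + H // 2, y0 + H)
    bottom = region_sum(grad, x0, x0 + W, y0 + H // 2, y0 + H)
    return folded(bottom_left / bottom)
```

A box whose gradient energy is evenly balanced between the compared regions scores 0.5 on each metric; a box with zero gradient in a denominator region would need separate handling.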

H. Fusion Face Detection Metric 

It was determined that a combination of the above parameters results in very reliable face
detection results relative to the results achieved by prior art methods. The following
combined detector is a fusion of five of the metrics described above:

$$F_{fusion}(\cdot) = F_{skin}(\cdot) \cdot F_{sym}(\cdot) \cdot F_{TS}(\cdot) \cdot F_{TR}(\cdot) \cdot F_{ELTS}(\cdot)$$

The fusion face detection metric, while only utilizing five detectors in its face score
calculation, utilizes other metrics for optimization, as described below.
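
As a small illustration, the fused score is simply the product of the five individual metric values for a box. The dictionary keys are hypothetical labels chosen for this sketch.

```python
from math import prod

FUSED_METRICS = ("skin", "sym", "ts", "tr", "elts")


def fusion_score(metrics):
    """Multiply the five fused metric values; other entries are ignored."""
    return prod(metrics[name] for name in FUSED_METRICS)
```

Because the metrics are multiplied, a box that scores near zero on any single metric receives a low fused score regardless of the others.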

Test 

A face detection experiment was performed on a set of 30 faces. These faces were
mostly frontal views without any rotation. Also, each image contained exactly one face.
As a result, the reported results include only the detection rate, since ROC curves,
number of false positives, and number of false negatives are unnecessary here. In
essence, the number of false negatives (i.e. the missed faces) will be the same as the
number of false positives (i.e. the incorrect face position estimates for the missed faces)
and therefore approximately equal to 100% minus the detection rate.

The face detector approaches discussed above were tested based on their reliability and
accuracy. Reliability was measured as the percentage of correctly detected faces (based
on the manual markings of the face in each image). A face was deemed to be correctly
detected if the left, top, right, and bottom boundaries of the detected face were all less
than 10 pixels away from the boundaries of the manually marked faces.

Another measure, related to the accuracy of the detected faces, consisted of the root mean
square error (RMSE) of the face box coordinates. This value was calculated as the square
root of the mean square distance error of the top-left corner of the face box plus the mean
square distance error of the bottom-right corner of the face box. The RMSE was
measured separately for the correctly detected faces and the incorrectly detected faces.
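
The two evaluation measures can be sketched as follows, with boxes given as (left, top, right, bottom) tuples. The RMSE wording is read here as the square root of the sum of the two mean square corner distances; that reading, like the function names, is an assumption.

```python
from math import sqrt


def is_correct(detected, marked, tolerance=10):
    """True if every boundary is less than `tolerance` pixels from the marking."""
    return all(abs(d - m) < tolerance for d, m in zip(detected, marked))


def box_rmse(detections, markings):
    """RMSE of the face box corners over paired lists of boxes."""
    n = len(detections)
    top_left = sum((d[0] - m[0]) ** 2 + (d[1] - m[1]) ** 2
                   for d, m in zip(detections, markings)) / n
    bottom_right = sum((d[2] - m[2]) ** 2 + (d[3] - m[3]) ** 2
                       for d, m in zip(detections, markings)) / n
    return sqrt(top_left + bottom_right)
```

To reproduce the split used in the results, the detections would first be partitioned with `is_correct` and `box_rmse` applied to each group separately.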



The table below summarizes the results of the experiment.

| Face Detection Algorithm | Detection Rate | RMSE for correct detections (in pixels) | RMSE for incorrect detections (in pixels) |
|---|---|---|---|
| Symmetry | 0% | N/A | 133.99 |
| Template Subtraction | 83.33% | 5.33 | 57.59 |
| Template Ratio | 33.33% | 4.70 | 139.15 |
| Skin Detector | 10% | 10.05 | 57.55 |
| Eye-Lip Total Symmetry | 0% | N/A | 92.82 |
| Eye Total Symmetry | 0% | N/A | 83.46 |
| Lip Total Symmetry | 0% | N/A | 93.00 |
| EigenFace-based | 23.33% | 5.03 | 42.48 |
| Convolutional Neural Networks | 86.67% | 8.00 | 23.23 |
| Fusion | 93.33% | 4.96 | 69.38 |

Table 1. Face detection rates for various face detectors using a set of 30 face images.

As shown, the fusion of the simple face detectors outperforms prior art algorithms. The
only two errors that were made by the fused detector are shown in the figure below (as
the rightmost images in the bottom row).

The described fusion detector not only has the highest detection rate of any approach, but
it also has a very low RMSE measure for correct detections (4.96 pixels). It has a
detection rate that is 10 percentage points higher than any of the simple approaches and
almost 7 percentage points higher than the neural network based approach, which
was extensively trained.

Optimizations 

The fusion of multiple detectors not only can improve the reliability and accuracy of the
face detector but it can also improve its efficiency. By performing simple checks to test
the validity of each test patch, further and more complicated computations can be avoided
for non-face patches.

By running the face detectors on a variety of faces, it was determined that for correct face
patches the following conditions are almost always met: $F_{skin}(\cdot) > 0.65$;
$F_{TR}(\cdot) > 0.5$; $F_{ELTS}(\cdot) > 0.4$; $F_{ETS}(\cdot) > 0.4$; and $F_{LTS}(\cdot) > 0.4$.

As a result, at every point of the computation, if the appropriate parameter did not surpass
its corresponding condition, further computations on the current face box were skipped.
This was done for a selective subset of the conditions as well as for all conditions
combined, as shown in the table below. The timing data below is based on a GNU™ C
implementation of the face detection algorithm running on an Intel™ P4 2.2GHz
processor with 1GB RAM. All images were resized to a width of 100 pixels prior to
performing face detection.
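
The early-exit checks can be sketched as a short cascade that evaluates each metric lazily and stops at the first failed condition. The thresholds are the ones listed above; the evaluation order (skin first) and the dictionary-of-callables interface are assumptions for this example.

```python
# (metric name, minimum value that a valid face box is expected to exceed)
THRESHOLDS = (("skin", 0.65), ("tr", 0.5), ("elts", 0.4), ("ets", 0.4), ("lts", 0.4))


def passes_checks(metric_fns, box):
    """Evaluate the continuation conditions for one box, stopping early.

    metric_fns maps each metric name to a callable taking the box.
    Returns (passed, names_evaluated) so the saved work is visible.
    """
    evaluated = []
    for name, limit in THRESHOLDS:
        evaluated.append(name)
        if metric_fns[name](box) <= limit:
            return False, evaluated  # skip all further work on this box
    return True, evaluated
```

Only boxes that pass every check would go on to the full fused score computation, which is where the time saving comes from.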



| Optimization | Average Execution Time Per Face | % Improvement |
|---|---|---|
| None | 12.17 s | 0% |
| $F_{skin}(\cdot) > 0.65$ | 5.65 s | 54% |
| $F_{TR}(\cdot) > 0.5$ | 10.42 s | 14% |
| $F_{ELTS}(\cdot) > 0.4$, $F_{ETS}(\cdot) > 0.4$, $F_{LTS}(\cdot) > 0.4$ | 11.25 s | 8% |
| All of the above conditions | 4.92 s | 60% |

Table 2. Performance of the fusion face detector with continuation conditions which reduce the
detection time.

By avoiding a costly search in regions where the likelihood of a face is small, the
conditions above reduce the average face detection time by 60%.

With further optimizations, including searching every 2-4 (instead of 1) pixels depending
on the box size, for example, the average execution time per image becomes slightly
more than 1.5s, which is in the range of acceptability for live web applications. An FPGA
(Field Programmable Gate Array) implementation of the above algorithm is also possible,
and it is estimated that the algorithm described above running on a single state-of-the-art
FPGA will be able to handle 1000 face detection requests per second.

In addition to the test described above, several experiments were performed using a 
frontal face database consisting of 450 color images of 27 individuals in various lighting 
conditions. The fusion detector correctly detected 404 of the 450 images, without any 
training or pre-processing, which corresponds to a 90% detection rate. 

Since many of the detection errors were due to poor lighting conditions, or the result of
faces that were smaller than the smallest search box, ill-conditioned images were
removed from the dataset and a second experiment involving 426 images was performed.
The new detection rate was 404 out of 426 images, which corresponds to a 95% detection
rate.

The modification utility (106) of the present invention is programmed in a manner known
to those skilled in the art, depending on the nature of the various system implementations
of the present invention, including those described above. In one particular embodiment
of the modification utility (106), it includes an automatic face detection utility (using the
fusion approach described above), a virtual facelift and selective facelift utility (described
above), a feature detection utility (as described above), a feature replacement utility (as
outlined above), as well as software components in JavaScript/AJAX/PHP/C/C++ for the
web/email interface for the mentioned applications, as well as for interfacing the web
presence and email presence of the invention with the said utilities.

It will be appreciated by those skilled in the art that other variations of the embodiments 
described herein may also be practised without departing from the scope of the invention. 
The within disclosure discusses certain system components, software components, or 
other utilities, as means for illustrating the operation and implementation of the present 
invention. It should be understood that the present invention is not limited to particular 
software, system, or network architectures or configurations, or to specific allocations of 
resources or functionality as between particular system components, software 
components, or other utilities. It should be understood that one or more system 
components, software components, or other utilities, could be provided as a greater or 
lesser number of system components, software components, or other utilities. As 
discussed above, the modification utility of the present invention, or aspects thereof, 
could be pre-loaded on a computer, or pre-loaded on mobile devices. The functionality 
described can be provided based on numerous architectures for delivering functionality, 
including but not limited to a client-server architecture, web service architecture (with or 
without resident software components), and standalone computer systems. While add-on 
utilities have not been discussed, it would be obvious to a person skilled in the art that 
various add-on utilities can be included into or linked to the modification utility for 
example to include alternate face detection or facial feature detection functionality, and 
additional face modification features (such as additional smoothing, specific color 
blending techniques and the like). The present invention provides for certain automatic 
face modification techniques, and invoking user intervention in certain cases. The way in 
which user intervention is involved and processed, and the tools used for user 
intervention, can be modified without departing from the scope of the invention. For 
example, additional tools for enabling user directed face modification in addition to the 
automated face modification described in the present disclosure, are contemplated. The 
present invention is not limited to any particular software structure, including a modular 
structure. Furthermore, the present invention could be implemented on a variety of 
hardware structures including digital signal processors, Field Programmable Gate Arrays 
(FPGAs), or Very Large-scale Integrated Circuits (VLSI). 
CLAIMS 

What is claimed is: 

1. A method for modifying digital images comprising: 

(a) detecting a face in a first digital image and optionally detecting a face in a 
second digital image, if the location of face(s) in the first digital image or 
the second digital image has not already been established; 

(b) establishing regions of interest in the face in the first digital image and 
optionally establishing regions of interest in the face in the second digital 
image; 

(c) detecting features in the regions of interest in the face in the first digital 
image and optionally detecting features in the regions of interest in the 
face in the second digital image; and 

(d) modifying the first digital image by either: 

(i) matching and replacing one or more of the features in the face in 
the first digital image with the one or more features in the regions 
of interest in the face in the second digital image, thereby defining 
a modified digital image; or 

(ii) isolating from modification the regions of interest in the first 
digital image, modifying the first digital image other than the 
regions of interest, and replacing the regions of interest into the 
modified first digital image. 
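
By way of illustration only, option (d)(ii) above (isolate the regions of interest, modify the rest of the image, then put the regions back) can be sketched as follows. This is a minimal sketch under stated assumptions: the image is a 2-D grayscale NumPy array, the modification is a simple mean filter standing in for whatever smoothing is actually used, and the helper names `box_blur` and `modify_outside_regions` and the `(top, left, bottom, right)` box format are hypothetical.

```python
import numpy as np

def box_blur(image, radius=1):
    """Smooth a 2-D grayscale image with a (2*radius+1) x (2*radius+1) mean filter."""
    padded = np.pad(image.astype(float), radius, mode="edge")
    out = np.zeros(image.shape, dtype=float)
    size = 2 * radius + 1
    # Sum the shifted copies of the padded image, then divide by the window area.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)

def modify_outside_regions(image, regions, radius=1):
    """Smooth the whole image, then restore the original pixels inside each region.

    regions: list of (top, left, bottom, right) boxes (hypothetical format).
    """
    # Isolate: keep an untouched copy of every region of interest.
    saved = [(box, image[box[0]:box[2], box[1]:box[3]].copy()) for box in regions]
    # Modify: smooth everything (a stand-in for the actual modification).
    result = box_blur(image, radius)
    # Replace: paste the untouched regions back into the modified image.
    for (top, left, bottom, right), patch in saved:
        result[top:bottom, left:right] = patch
    return result
```

With this arrangement the detected features keep their original detail, while the surrounding area receives the modification.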

2. The method of claim 1 wherein the features include eyes, eyebrows, nose, mouth, 
lips or hair. 

3. The method of claim 1 further comprising blending, recoloring, shifting or 
resizing the one or more features in the face in the second digital image. 



WO 2007/128117 PCT/CA2007/000784 

4. The method of claim 1 further comprising adjusting size and location of the one 
or more features in the face in the second digital image in the modified digital 
image to increase the perceived beauty of a face in the modified digital image. 

5. The method of claim 1 further comprising color adjusting the one or more 
features in the face in the second digital image to correspond with the face in the 
first digital image, or color adjusting the features of the face in the first digital 
image to correspond to the modified first digital image. 

6. The method of claim 5 wherein the color adjusting is achieved by performing 
color histogram transformations. 
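
As an illustration of what a color histogram transformation can look like, the sketch below remaps one color channel so that its cumulative histogram follows that of a reference channel (ordinary histogram matching). It is an assumption that this is a suitable stand-in; the claims do not fix the exact transformation, and the function name `match_histogram` is hypothetical. For a color image it would be applied to each channel separately.

```python
import numpy as np

def match_histogram(source, reference):
    """Remap `source` values so their cumulative histogram follows `reference`.

    Both inputs are single-channel arrays (for example one color channel each).
    """
    src_values, src_idx, src_counts = np.unique(
        source.ravel(), return_inverse=True, return_counts=True)
    ref_values, ref_counts = np.unique(reference.ravel(), return_counts=True)
    # Cumulative distribution of each channel, normalized to end at 1.0.
    src_cdf = np.cumsum(src_counts) / source.size
    ref_cdf = np.cumsum(ref_counts) / reference.size
    # For each source level, pick the reference level at the same CDF position.
    mapped = np.interp(src_cdf, ref_cdf, ref_values)
    return mapped[src_idx].reshape(source.shape)
```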

7. The method of claim 6 wherein the color histograms favour outside areas of the 
one or more features. 

8. The method of claim 1 further comprising blending the one or more features in 
the face in the second digital image with the face in the first digital image. 

9. The method of claim 8 wherein the blending is achieved by gradient filling using 
a blending mask. 

10. The method of claim 9 wherein the blending mask corresponds to the regions of 
interest in the face in the first digital image. 
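
One way to picture blending with a gradient-filled mask is sketched below. It assumes grayscale NumPy arrays and a mask that ramps linearly from 0 at the border of the feature box to 1 in its interior; the ramp shape, the `border` parameter, and the names `gradient_mask` and `blend_feature` are illustrative assumptions rather than details taken from the claims.

```python
import numpy as np

def gradient_mask(height, width, border):
    """Mask that is 1.0 in the interior and ramps linearly to 0.0 at the edges."""
    ys = np.minimum(np.arange(height), np.arange(height)[::-1])
    xs = np.minimum(np.arange(width), np.arange(width)[::-1])
    dist = np.minimum(ys[:, None], xs[None, :])  # distance to the nearest edge
    return np.clip(dist / float(border), 0.0, 1.0)

def blend_feature(base, feature, top, left, border=2):
    """Alpha-blend `feature` into `base` at (top, left) using the gradient mask."""
    h, w = feature.shape
    mask = gradient_mask(h, w, border)
    out = base.astype(float).copy()
    region = out[top:top + h, left:left + w].copy()
    # Interior pixels come from the feature, border pixels fade into the base.
    out[top:top + h, left:left + w] = mask * feature + (1.0 - mask) * region
    return out
```

Because the mask falls to zero at the box border, the pasted feature has no hard seam against the surrounding face.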

11. The method of claim 1 further comprising resizing the one or more features in the 
face in the second digital image to correspond with the one or more features in the 
face in the first digital image prior to matching and replacing. 

12. The method of claim 1 further comprising shifting the one or more features in the 
face in the second digital image to correspond with the one or more features in the 
face in the first digital image prior to matching and replacing. 

13. The method of claim 1 further comprising blending the face in the first digital 
image prior to matching and replacing one or more of the features in the face in 
the first digital image with the one or more features in the face in the second 
digital image. 
14. The method of claim 1 wherein location of the face in the first digital image 
and/or location of the face in the second digital image is determined including by 
using user input. 

15. The method of claim 1 wherein the modified first image is used to provide a 
virtual facelift of the face of the first digital image. 

16. The method of claim 1 wherein the location of the face in the first digital image 
and optionally the location of the face in the second digital image are determined 
by calculating edge intensities and by using a set of deterministic rules for edges 
within a face. 

17. The method of claim 1 wherein the regions of interest in the face in the first 
digital image and optionally the regions of interest in the face in the second digital 
image are established using a generic face template. 

18. The method of claim 17 wherein the generic face template is proportional in size 
to the face in the first digital image. 
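
A generic face template that scales with the detected face can be sketched as a table of feature boxes expressed as fractions of the face box. The fractions below are made-up placeholder values, and the names `GENERIC_TEMPLATE` and `regions_of_interest` are hypothetical; only the idea of proportional scaling is taken from the claims.

```python
# Hypothetical generic template: each feature box is (x, y, width, height),
# expressed as a fraction of the face box and measured from its top-left corner.
# The numbers are illustrative placeholders, not values from the disclosure.
GENERIC_TEMPLATE = {
    "left_eye": (0.15, 0.25, 0.30, 0.15),
    "right_eye": (0.55, 0.25, 0.30, 0.15),
    "nose": (0.35, 0.40, 0.30, 0.25),
    "lips": (0.30, 0.70, 0.40, 0.15),
}

def regions_of_interest(face_x, face_y, face_w, face_h, template=GENERIC_TEMPLATE):
    """Scale the generic template to a detected face box, returning pixel boxes."""
    rois = {}
    for name, (fx, fy, fw, fh) in template.items():
        rois[name] = (
            face_x + int(round(fx * face_w)),
            face_y + int(round(fy * face_h)),
            int(round(fw * face_w)),
            int(round(fh * face_h)),
        )
    return rois
```

Doubling the face box doubles every region, so the same template serves faces of any size.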

19. The method of claim 18 wherein the features in the regions of interest in the face 
in the first digital image and optionally the features in the regions of interest in the 
face in the second digital image are detected by calculating gradient intensities 
and a gradient template. 

20. The method of claim 19 wherein the features in the regions of interest in the face 
in the first digital image, and optionally the features in the regions of interest in 
the face in the second digital image, are also detected by deviation from skin 
color. 

21. The method of claim 1 wherein selected regions of the face in the first digital 
image are modified to produce a facelift effect in those selected regions. 

22. The method of claim 1 further comprising the step of making the modified digital 
image or modified first digital image available on a computer or wireless device 
via a computer network or wireless network. 
23. The method of claim 1 wherein the face in the first digital image and optionally 
the face in the second digital image are detected using a fusion face detection 
means. 

24. The method of claim 23 wherein the fusion detection means consists of one or 
more of symmetry-based face detection, template subtraction face detection, 
template ratio face detection, skin-detector-based face detection, eye-lip total 
symmetry face detection, eye total symmetry face detection, or lip total symmetry 
face detection. 

25. A method for modifying a digital image comprising: 

(a) detecting a face in the digital image; 

(b) establishing regions of interest in the face in the digital image; 

(c) detecting features in the regions of interest in the face in the digital image; 

(d) augmenting the face in the digital image by smoothing selective regions; 
and 

(e) replacing the features in the face in the digital image, thereby defining a 
modified digital image. 

26. The method of claim 25 wherein the regions of interest in the face in the digital 
image are established using a generic template. 

27. The method of claim 26 wherein the generic template is proportional in size to the 
face in the digital image. 

28. The method of claim 27 wherein the features in the regions of interest in the face 
in the digital image are detected by calculating gradient intensities using one or 
more boxes in the generic template. 

29. The method of claim 1 wherein the face in the first digital image and optionally 
the face in the second digital image is detected using a fusion detection means. 
30. The method of claim 29 wherein the fusion detection means consists of one or 
more of symmetry-based face detection, template subtraction face detection, 
template ratio face detection, skin-detector-based face detection, eye-lip total 
symmetry face detection, eye total symmetry face detection, or lip total symmetry 
face detection. 

31. A system for modifying digital images comprising: 

(a) a computer linked to a database, the computer including or being linked to 
a utility for enabling one or more users to upload, store, retrieve, email, 
display and/or manage digital images; 

(b) a modification utility linked to the computer, the modification utility being 
operable to provide instructions to the computer that enable the computer 
to: 

(i) detect a face in a first digital image and optionally detect a face in 
a second digital image, if the location of the faces in the first 
digital image or the second digital image has not already been 
established; 

(ii) establish regions of interest in the face in the first digital image and 
optionally establish regions of interest in the face in the second 
digital image; 

(iii) detect features in the regions of interest in the face in the first 
digital image and optionally detect features in the regions of 
interest in the face in the second digital image; and 

(iv) modify the first digital image by either: 

(A) matching and replacing one or more of the features in the 
face in the first digital image with the one or more features 
in the face in the second digital image, thereby defining a 
modified digital image; or 
(B) isolating from modification the regions of interest in the 
first digital image, modifying the first digital image other 
than the regions of interest, and replacing the regions of 
interest into the modified first digital image. 

32. The system of claim 31 wherein the modification utility further includes means 
for processing digital images consisting of one or more of the following 
operations performed on the digital images: resizing, histogram equalization, 
compressing, histogram transformation, color adjustment, recoloring, correlating 
in two dimensions, convolving in two dimensions, blending, edge extraction, 
cropping, smoothing, or blending based on a template. 

33. The system of claim 31 wherein the computer is a server computer, and the one or 
more users are associated with remote computers linked to the server computer 
via a communication network, and the server computer is enabled to provide access 
from the remote computers to computer resources that enable the one or more 
users to upload, store, retrieve, email, display or manage digital images, and 
wherein the server computer is operable to receive image modification 
instructions from the remote computers. 

34. The system of claim 33 wherein the server computer is interoperable with remote 
computers consisting of a personal computer, mobile phone, Internet device, 
handheld computer or a kiosk. 

35. The system of claim 34 wherein the server computer enables the one or more 
users to interactively access digital images via the remote computer, and the 
server computer is operable to receive image modification instructions from the 
remote computer. 

36. The system of claim 31 wherein the computer is part of a kiosk. 

37. The system of claim 31 wherein the features include eyes, eyebrows, nose, mouth, 
lips or hair. 
38. The system of claim 37 wherein the modification utility is operable to enable the one 
or more users to select the features to match and replace. 

39. The system of claim 33 wherein at least one of the remote computers is a wireless 
device, and the server computer is operable to receive image modification 
instructions from a user via the wireless device, and the server computer is 
operable to deliver the modified digital image or modified first digital image to 
the wireless device via a wireless network. 

40. The system of claim 31 wherein the computer is a personal computer and the 
database is an online database linked to a web server, the web server being 
operable to enable the personal computer to upload, store, retrieve, email, display 
and/or manage digital images stored to the online database. 

41. The system of claim 31 wherein the modification utility is embedded in field 
programmable gate arrays. 

42. The system of claim 31 further comprising a camera to generate digital images for 
modification. 

43. The system of claim 31, wherein the modification utility is operable to enable the 
computer to: 

(i) permit the one or more users to interactively access the digital 
images and select one or more digital images for modification; 

(ii) detect a face in the digital image; 

(iii) establish regions of interest in the face in the digital image; 

(iv) detect features in the regions of interest in the face in the digital 
image; 

(v) blend the face in the digital image; 

(vi) replace the features in the face in the digital image, thereby 
defining a modified digital image; and 

(vii) display the modified digital image to the user. 

44. The system of claim 31 wherein the system further comprises a detection utility 
linked to the computer being operable to provide instructions to the computer that 
enable the computer: 

(a) to progressively scan a digital image with a detection box of varying size; 
and 

(b) using two or more detection techniques embodied in the detection utility, 
such detection techniques consisting of (i) symmetry-based detection, (ii) 
template subtraction detection, (iii) template ratio detection, (iv) 
skin-detector-based detection, (v) eye-lip total symmetry detection, (vi) eye 
total symmetry detection, or (vii) lip total symmetry detection, 
establishing face scores and face coordinates by operation of the detection 
techniques, thereby detecting the existence of a face in the digital image. 
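
The scanning step and the combination of two or more detection techniques might be sketched as below. This is only a sketch under stated assumptions: the image is a 2-D array with values in [0, 1], each detector is a callable returning a score in [0, 1], and the fused face score is the plain mean of the detector scores. The fusion rule, the `step` parameter, and the names `symmetry_score` and `fused_face_search` are assumptions; `symmetry_score` is a toy stand-in and not the symmetry-based detector of the disclosure.

```python
import numpy as np

def symmetry_score(patch):
    """Toy left-right symmetry score in [0, 1] for a patch with values in [0, 1]."""
    return 1.0 - float(np.mean(np.abs(patch - patch[:, ::-1])))

def fused_face_search(image, detectors, box_sizes, step=4):
    """Scan square boxes of several sizes; return (best_score, (x, y, size)).

    detectors: callables mapping an image patch to a score in [0, 1].
    The fused score is the mean of the detector scores (an assumed fusion rule).
    """
    best_score, best_box = -1.0, None
    height, width = image.shape[:2]
    for size in box_sizes:                      # detection box of varying size
        for y in range(0, height - size + 1, step):
            for x in range(0, width - size + 1, step):
                patch = image[y:y + size, x:x + size]
                score = float(np.mean([d(patch) for d in detectors]))
                if score > best_score:          # keep the highest face score
                    best_score, best_box = score, (x, y, size)
    return best_score, best_box
```

The returned pair corresponds to a face score and face coordinates; a threshold on the score would then decide whether a face is present.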



45. The system of claim 44 wherein the server computer is operable to remotely 
return the face score and the face coordinates to a user. 

46. The method of claim 1 wherein the location of the features are used to re-evaluate 
the location of the face, thereby enabling interactive face detection and facial 
feature detection. 

47. The method of claim 1 wherein after the facial features are detected, a three 
dimensional projective transformation is performed on the face to make the face 
frontal, followed by face modification. 

48. The system of claim 31 wherein the computer is operable to perform 
automatically virtual facelift operations on digital images of faces upon a user 
uploading or selecting a digital image of a face. 

49. The system of claim 31 wherein the modification utility is operable to simulate 
the effects of different cosmetics and plastic surgery products automatically using 
computer vision and artificial intelligence techniques. 
50. The system of claim 31 wherein the modification utility is operable to simulate 
the post operation appearance of a face prior to an operation taking place by 
selectively modifying a digital image of the face using computer vision and 
artificial intelligence techniques. 

51. The system of claim 31 wherein revenue generation is linked to image 
modification by one or more entities receiving a fee by permitting one or more 
users to pay a fee to modify the digital image of a face provided to the system. 

52. The system of claim 31 wherein the one or more users submit a digital image of a 
face with a set of image modification criteria, or linked to a set of image 
modification criteria, and the computer is operable to provide an Internet link to a 
dynamic image that changes over time based on the set of image modification 
criteria, or a subset of such image modification criteria selected by the one or 
more users. 

53. The system of claim 31 wherein the system is operable to enable automatic 
extraction of facial features (eyes, eyebrows, nose, and mouths/lips), followed by 
optional face smoothing and selective replacement of the facial features in order 
to perform a virtual facelift and skin reconditioning on a digital image of a face. 

54. The system of claim 52 wherein the system is operable to receive user adjustment 
instructions to assist in modification of the digital image of the face. 

55. The system of claim 31 wherein the computer includes a digital pad, or electronic 
device including a display screen and an input means for entering user requested 
face modifications; and the system includes a camera. 

56. The system of claim 31 wherein the modification utility is operable to enable the 
one or more users to select the areas of a digital image of a face for modification, 
and the degree of modification to be applied to selected areas. 

57. The system of claim 31 wherein the modification utility is operable to apply one 
or more virtual face lift operations in one or more areas of the digital image of the 
face including the eyebrows, the mid-brow, forehead, below the eye, around the 
eye, inner cheek, outer cheek, nose, below nose, jaw, chin, lip, eye, hair, mouth, 
and below mouth areas. 

58. The system of claim 31 wherein the modification utility embodies a fusion face 
detection means for detecting a face in a digital image. 

59. The system of claim 57 wherein the fusion detection means embodies symmetry- 
based face detection, template subtraction face detection, template ratio face 
detection, skin-detector-based face detection, eye-lip total symmetry face 
detection, eye total symmetry face detection, or lip total symmetry face detection, 
and dynamically applies two or more such detection techniques. 

60. The system of claim 31 wherein the digital images include one or more video 
frames. 

61. A computer program product for enabling the modification of digital images 
comprising: 

(a) a computer readable medium bearing software instructions; and 

(b) the software instructions for enabling the computer to perform 
predetermined operations, the predetermined operations including the 
steps of: 

(i) detecting a face in a first digital image and optionally detecting a 
face in a second digital image, if the location of the faces in the 
first digital image or the second digital image has not already been 
established; 

(ii) establishing regions of interest in the face in the first digital image 
and optionally establishing regions of interest in the face in the 
second digital image; 

(iii) detecting features in the regions of interest in the face in the first 
digital image and optionally detecting features in the regions of 
interest in the face in the second digital image; and 
(iv) modifying the first digital image by either: 

(A) matching and replacing one or more of the features in the 
face in the first digital image with the one or more features 
in the regions of interest in the face in the second digital 
image, thereby defining a modified digital image; or 

(B) isolating from modification the regions of interest in the 
first digital image, modifying the first digital image other 
than the regions of interest, and replacing the regions of 
interest into the modified first digital image. 
FIG. 1B 
FIG. 6c 
Box No. II Observations where certain claims were found unsearchable (Continuation of item 2 of the first sheet) 

This international search report has not been established in respect of certain claims under Article 17(2)(a) for the following 
reasons: 

1. [ ] Claim Nos.: 
because they relate to subject matter not required to be searched by this Authority, namely: 

2. [ ] Claim Nos.: 
because they relate to parts of the international application that do not comply with the prescribed requirements to such an extent 
that no meaningful international search can be carried out, specifically: 

3. [ ] Claim Nos.: 
because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a). 

Box No. III Observations where unity of invention is lacking (Continuation of item 3 of first sheet) 

This International Searching Authority found multiple inventions in this international application, as follows: 
See extra sheet. 

1. [ x ] As all required additional search fees were timely paid by the applicant, this international search report covers all 
searchable claims. 

2. [ ] As all searchable claims could be searched without effort justifying additional fees, this Authority did not invite 
payment of additional fees. 

3. [ ] As only some of the required additional search fees were timely paid by the applicant, this international search report 
covers only those claims for which fees were paid, specifically claim Nos.: 

4. [ ] No required additional search fees were timely paid by the applicant. Consequently, this international search report is 
restricted to the invention first mentioned in the claims; it is covered by claim Nos.: 

Remark on Protest 
[ ] The additional search fees were accompanied by the applicant's protest and, where applicable, 
the payment of a protest fee. 
[ ] The additional search fees were accompanied by the applicant's protest but the applicable protest 
fee was not paid within the time limit specified in the invitation. 
[ ] No protest accompanied the payment of additional search fees. 
Group A of claims 1-24, 29-61 refers to a method, a system and a computer program product for modifying digital images, the 
method comprising: 

a) detecting a face in a first digital image; 

b) establishing regions of interest in the face; 

c) detecting features in the regions of interest; and 

d) modifying the first digital image by either 

i) matching and replacing one or more features in the face in the first digital image, thereby defining a modified digital 
image; or 

ii) isolating from modification the regions of interest in the first digital image, modifying the first digital image other than 
the regions of interest, and replacing the regions of interest into the modified first digital image. 

Group B of claims 25-28 refers to a method for modifying a digital image, the method comprising: 

a) detecting a face in a first digital image; 

b) establishing regions of interest in the face; 

c) detecting features in the regions of interest; 

d) augmenting the face by smoothing selective regions; and 

e) replacing the features thereby defining a modified digital image. 

Steps "a" to "c" in the two groups are identical. However, they only set some prerequisite conditions before the processing per se, 
and the features characterizing the processing in the two groups are different and do not establish a common inventive link. 